I was chatting to some of my colleagues at work recently about the potential benefits of a JavaScript compiler vs a C compiler, when targeting an embedded MCU. A concern that came up multiple times regarding different features, is
…but you can already do that in C, so how is JS better?
This is a good point. If you’re given a problem, you can solve it in JS, or you can solve it in C (or C++). Is JavaScript really any better? Of course, you know my answer, which is probably pretty biased. But let’s dig into a few of the details.
Firstly, let’s put aside the question of performance for a moment. I argue that JS is going to yield a more performant program1, while I’m sure that most C programmers will argue that C will perform better. Probably both are right, under different circumstances. Let’s just side step this issue for the moment.
There is at least one very significant reason why I think JS has a lot more to offer than C or C++, and that’s the ability to write reusable code. This is the gateway feature that opens the door to a 10x improvement in productivity, because it means that you can leverage highly customizable libraries to do most of the heavy lifting, rather writing the code yourself. Yes, in C and C++ you can use third party libraries. But in JS you can do it an order of magnitude faster. There are many, many times, where it’s taken me literally less than 5 minutes to go from “I wonder if there’s a library that already does this”, to using the library and moving on to the next task.
The same cannot be said for any third party C or C++ library or code file that I’ve ever used, which typically take days or weeks of integration work, reading documentation, figuring out why it doesn’t compile2, and then debugging cryptic issues.
So why is this the case? Is it the package managers? Is it the community? Is it the language?
I’m guessing it’s multiple reasons, but I’m going to focus on one in particular in this post: binary coupling (at least that’s what I’m calling it).
An Example: a Sum Function
Let’s take a dummy example. Consider a function that adds together an collection of numbers (sound familiar?). In JavaScript, you can be certain it will almost definitely have a signature like this:
function sum(numbers);
We can use TypeScript to make things even more unambiguous, although this doesn’t change the meaning of the code:
function sum(numbers: number[]): number;
This function logically takes a collection of numbers. In JS, collections of this nature are represented as arrays. The output is the sum, which is obviously also a number.
What might this look like in C or C++? Here are some options:
int sum(const int* numbers, size_t count);
int sum(const int* numbers, size_t count) __attribute__3;
// Null terminated
uint32_t sum(const uint_32_t** numbers);
int sum(const std::vector<int>& numbers);
template <int Length>
int sum(int numbers[Length]);
template <int Length>
constexpr int sum(const int &numbers[Length]);
// Using an STL-like input iterator (see std::accumulate)
template <typename Iterator>
int sum(Iterator begin, Iterator end);
// Is this even possible? (passing values at compile time)
template<int ...values>
int sum();
Forgive me if I have some or all of these wrong. In my life, I’ve programmed much more in C++ than in JavaScript, but C++ always remains a little bit too complicated for me to remember all the subtleties.
What’s the difference between all of these options? In my mind, they are all logically the same thing. They represent a function to which you pass an “array” of numbers, and it give you the result.
The difference comes mainly down to how a client interacts with these functions at a binary level. What calling convention is used? How is the array represented in memory? What parameters or aspects of the function are available at compile time vs runtime?
If sum
was a third party library, which of these forms would it take?
We have at least one good answer to this question, since the functionality is already part of the STL in C++. It takes the following form:
template< class InputIt, class T >
T accumulate( InputIt first, InputIt last, T init );
There’s nothing surprising about this. Since we’re talking about the standard template library, you expect this to be a function template. It’s a fairly generic solution, but certainly it won’t suit everyone. Because it’s a template, you can’t take the pointer of it, you can’t pass it around as an argument to non-template code, you can’t export it directly as a library function from an object file, etc. For example, try to translate the following JavaScript code into C++:
function aggregate(useSum, arr) {
return (useSum ? sum : average)(arr);
}
The useSum
and arr value might or might not be available at compile time — a reusable piece of code shouldn’t assume one or the other.
How C++ Usually Solves This
The solution to this in C++ is normally to shunt everything to runtime when you need reusability. As a case in point, take a look at the GPIO methods that the mbed library provides. To set a single bit that represents a GPIO pin, you would use a function with the following signature:
// Set the output, specified as 0 or 1 (int)
void DigitalOut::write(int value);
This makes for fairly reusable code. But it comes at a cost. It invokes a function call4, using a runtime value. The function then needs to use indirect accesses and masks to figure out what bit to set, internally using a structure that looks like this:
typedef struct {
PinName pin;
uint32_t mask;
__IO uint32_t *reg_dir;
__IO uint32_t *reg_set;
__IO uint32_t *reg_clr;
__I uint32_t *reg_in;
} gpio_t;
The absolute cost here is a low – a function call is not a big deal, masking and indirection are quick, and passing an integer argument at runtime is not a big deal. But the relative cost is high — if value
and the specific port happened to be known at compile time, this whole thing could have been one machine instruction. Now it has to be many more, and extra memory overhead.
But surely in JavaScript this is even worse? Isn’t everything at runtime in JavaScript?
I would argue “no”. In JavaScript you don’t specify what gets done at compile time vs runtime — you leave it up to the JavaScript engine to decide. This is why I use the term “binary coupling”. In C++, your code is coupled to the binary representation of the resulting machine code — you are forced to make certain decisions, and those decisions force the C++ compiler to use certain representations. This makes C++ a glorified assembly language — it’s a way for you to write machine code in a somewhat abstracted way.
While in JavaScript, you don’t write machine code. You specify the behavior of the program in unambiguous terms, and leave it up to the engine or compiler to implement the specification in terms of machine code.
You get a small taste of this in C++ with micro-optimizations and inlining. You write a function, and in some circumstances the compiler is free to chose not to represent that as an actuall CALL
instruction in the output binary. This is an example that breaks the usual binary coupling, since the code does not dictate the binary interface. But this kind of behavior is necessarily very limited in C++.
The discussion here is not primarily about performance, it’s about reusuability. What I’m demonstrating is that in C++, it’s impossible to have a library API that doesn’t also dictate the binary ABI that is used to access that library. As a result you’re much less likely to find an existing library that serves your specific purposes, since you need to match not only the logical interface, but the binary interface as well (and the type interface, which is another story).