MetalScript Concepts: Using MetalScript in a C Project
I intend MetalScript to compile JavaScript code as either an executable or as a library. When I introduced the idea of MetalScript in a previous post, I gave a simple code example for how I imagine it to look when MetalScript compiles (and runs) a JavaScript program as an executable. In this post I’d like to present how I envisage it to look when a JavaScript program is compiled as a library, and a few juicy details about how this will work.
What do I mean by “library”?
Different people may mean different things when they say “library”. What I mean, in this context, is that the MetalScript compiler will be able to take a JavaScript file as input and produce a corresponding .o object file or .a static library, which can then be used by a traditional linker to merge the compiled JavaScript code into a larger compiled project.
Why would anyone want to do this?
This would certainly be an “advanced mode” use of the compiler. It brings back many of the things I hate about C/C++, such as linkers, intermediate files, etc, and so I would never recommend it as a starting point for new projects. The use case would be for people who have an existing legacy C or C++ codebase and who want to bring in a “sprinkle” of JavaScript to see how it feels, or perhaps even to bring existing NPM libraries into a C project.
What will it look like?
Let me get straight to an example. Here is one which would compile to a library that adds two numbers together. I’ve written it out intentionally verbose for expository purposes.
// mylibrary.js
import * as c from 'c-ffi';

function add(a, b) {
  return a + b;
}

c.exportFunction(
  'foo', // Exporting a function with the public name "foo"
  add,   // The function we want to export
  {
    ret: c.Int32,            // The return type
    args: [c.Int32, c.Int32] // The argument types
  }
);
And then to compile, you would execute something like the following command line:
metalscript --library -o mylibrary.a mylibrary.js
To use it, you might link it alongside the following C code:
#include <stdio.h>
#include <stdint.h>

extern int32_t foo(int32_t a, int32_t b);

int main() {
  int c = foo(1, 2);
  printf("The result is: %i\n", c);
  return 0;
}
Let’s break it down…
First, I consider that there will be a built-in MetalScript module which here I’ve called c-ffi (short for “C foreign function interface”), which provides the helpers and IO functions that are required to create library definitions allowing C to interface with our library. Think of how node has builtin modules for accessing the “outside world”, such as the fs module for accessing the file-system. The c-ffi module is providing analogous access to the outside world, except that here, the “outside world” is the surrounding C code in the larger project (for which MetalScript is not responsible).
c.exportFunction(name, fn, signature)
The c.exportFunction function[1] creates a new definition in the library symbol table, i.e. it creates a new symbol with external linkage. This symbol needs to have a name, and in the above example I’ve given it the name 'foo'. I’ve called it foo and not add, just to make a point: the name in the symbol table is unrelated to the JavaScript idea of a function name. To drive this point home further, consider that functions in JavaScript don’t need to have names at all, so the example could have been written like this (notice the lambda definition):
import * as c from 'c-ffi';

c.exportFunction('foo', (a, b) => a + b, {
  ret: c.Int32,
  args: [c.Int32, c.Int32]
});
I don’t understand. Does the compiler recognize c.exportFunction as special syntax, like C’s extern keyword?
No, the variable c here is a JavaScript object and c.exportFunction is a property on that object which is a function, just like any other JavaScript method. There is nothing special about the syntax c.exportFunction. You could have written c["ex"+"port"+"Function"] instead of c.exportFunction and it would have worked exactly the same.
What may be confusing to you is that we have what appears to be a mix of compile-time and runtime code in the same file[2]: the c.exportFunction statement executes at compile time, while the a + b expression executes at runtime.
Recall from my initial post on MetalScript that this is one of the key features of MetalScript: the program transitions between two phases, initially executing in the compiler and then moving to execute on the target device. Normally, when MetalScript is compiling a whole program (not just a library), the transition from compile time to runtime happens when the program calls mcu.start, and the entry point to the runtime program is the continuation of where the running process left off when it called mcu.start. In the case of a library, however, the programmer has chosen to go the “advanced route” and implement the global entry point (e.g. the reset interrupt vector) in another language, so the JavaScript code won’t have an mcu.start statement. What this means is that the whole script executes at compile time to completion, producing the export definitions as an artifact (as a result of calling c.exportFunction), and then the compiler uses the exported definitions as the entry points to the static library that it produces.
If you are used to C or C++, this might make your hackles stand on end. You might think that this has to use some ugly, hacky, complicated and fragile heuristics to figure out what’s compile time and what’s runtime, and to separate these “magically”, until it doesn’t work and everything breaks[3].
But I’m here to assure you that I’ve gone into sufficient detail in the MetalScript design to know it won’t need to resort to such hackery. The code you write will all be interpreted according to the ECMAScript standard.
Here’s an example that may help put your mind at ease. Let’s say that instead of creating a library with one function foo that adds two numbers together, we want to create a library with 10 functions that add the numbers 1 to 10 to a given argument respectively, as represented by the following C code:
extern int add1(int x);  // returns x + 1
extern int add2(int x);  // returns x + 2
extern int add3(int x);  // returns x + 3
extern int add4(int x);  // returns x + 4
extern int add5(int x);  // returns x + 5
extern int add6(int x);  // returns x + 6
extern int add7(int x);  // returns x + 7
extern int add8(int x);  // returns x + 8
extern int add9(int x);  // returns x + 9
extern int add10(int x); // returns x + 10
We can create such a library with the following JavaScript, which again I’ve written verbosely for illustration:
import * as c from 'c-ffi';

function createAddN(n) {
  function addN(x) {
    return x + n;
  }
  return addN;
}

const exportSignature = {
  ret: c.Int32,
  args: [c.Int32]
};

for (let i = 1; i <= 10; i++) {
  const exportName = `add${i}`;
  const implementation = createAddN(i);
  c.exportFunction(exportName, implementation, exportSignature);
}
Some key points I’d like to highlight about this example:
- The call to c.exportFunction only occurs once in this code (lexically), but because it is executed 10 times, there are 10 exports in the library. It is not some mystical static analysis that is producing these exports, but a real JavaScript engine executing code according to the defined standard.
- The names in the symbol table in this example are dynamically computed.
- You may notice that the program essentially exports addN 10 times, and each time it has a different behavior (adds a different number). This is because, in JavaScript, lexical functions (the code that makes up your function) are actually like function “templates”, which are instantiated every time the surrounding code is executed. Since createAddN is called 10 times, there are actually 10 distinct function instances of the addN function, each with a different closure, and each of these can be exported separately[4].
- A more subtle point is that the value for the parameter n in createAddN is part of a closure that is carried over from compile time to runtime. Again, this is perfectly valid JavaScript code, and MetalScript doesn’t use hackery or trickery; it will work just fine.
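The points above can be checked with an ordinary JavaScript engine. The following is a minimal sketch, assuming a stand-in for the proposed c-ffi module that does nothing except record export definitions. The createMockFfi helper, the exportTable property, and the string used for Int32 are all hypothetical and exist only for this illustration; MetalScript's real module may look quite different.

```javascript
// Hypothetical stand-in for the proposed 'c-ffi' module. It only records
// export definitions; it does not generate any native code.
function createMockFfi() {
  const exportTable = new Map(); // symbol name -> { fn, signature }
  return {
    Int32: 'Int32', // placeholder type tag, not a real marshaling type
    exportTable,
    exportFunction(name, fn, signature) {
      exportTable.set(name, { fn, signature });
    },
  };
}

const c = createMockFfi();

// Same shape as the article's example: each call creates a new closure over n.
function createAddN(n) {
  return function addN(x) {
    return x + n;
  };
}

const exportSignature = { ret: c.Int32, args: [c.Int32] };

for (let i = 1; i <= 10; i++) {
  c.exportFunction(`add${i}`, createAddN(i), exportSignature);
}

// Once the script has run to completion, the table holds ten entries,
// each pointing at a distinct function instance with its own value of n.
console.log([...c.exportTable.keys()].join(', '));
console.log(c.exportTable.get('add3').fn(4)); // 4 + 3
```

In this model, the table left behind after the script finishes plays the role of the export definitions that the compiler would turn into library entry points.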
If you aren’t yet convinced that this is a good idea, let me give you one more example. Let’s say that for debug purposes, you wanted to log any calls to exported functions. You could add the following at the beginning of your JavaScript program:
// Replace c.exportFunction with our own definition that logs incoming calls
const actualExport = c.exportFunction;
c.exportFunction = function (name, fn, signature) {
  function wrapper(...args) {
    console.log(`Calling function ${name} with arguments ${args}`);
    return fn(...args);
  }
  actualExport(name, wrapper, signature);
};
This snippet of code wraps the c.exportFunction function with our own implementation. Each time our own exportFunction function is called, it creates a wrapper function and exports that wrapper function instead. The wrapper function calls the original function, but it also logs the call to the console so we can see it happening.
If this doesn’t blow your mind, maybe it’s because you’re coming from “JavaScript land” and think that this example is obvious because it’s just vanilla JavaScript.
Does this mean I can export at runtime?
No (not surprisingly).
You may have thought that since c.exportFunction is a function, and functions can be called at runtime, we can therefore call c.exportFunction at runtime to create new export definitions while the program is running, but this isn’t the case. Or maybe you thought that since this couldn’t work at runtime (because you know that libraries can’t change after they’ve been compiled), the whole idea is invalid, but that isn’t true either.
The distinction between compile time and runtime is not invisible to the user code (nor should it be). The MetalScript APIs have different behavior depending on what phase the program is in, as one would expect. For example, the c.exportFunction function will only work at compile time (the call will throw an exception at runtime), and reading GPIO will only be meaningful at runtime.
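As a rough model of this phase-dependent behavior, here is a minimal sketch in plain JavaScript. Everything in it is an assumption made for illustration: createPhasedRuntime, the phase and exportCount members, the error message, and the way start flips the phase are hypothetical, and only the names exportFunction and start echo the article.

```javascript
// Hypothetical model of an API whose behavior depends on the program phase.
function createPhasedRuntime() {
  let phase = 'compile-time';
  const exportTable = new Map();

  return {
    get phase() {
      return phase;
    },
    exportCount() {
      return exportTable.size;
    },
    exportFunction(name, fn, signature) {
      // Export definitions are fixed once the program has moved to runtime.
      if (phase !== 'compile-time') {
        throw new Error(`Cannot export "${name}" at runtime`);
      }
      exportTable.set(name, { fn, signature });
    },
    // Stands in for the transition that mcu.start performs in the article.
    start() {
      phase = 'runtime';
    },
  };
}

const rt = createPhasedRuntime();
const sig = { ret: 'Int32', args: ['Int32', 'Int32'] }; // placeholder type tags

rt.exportFunction('foo', (a, b) => a + b, sig); // fine: still compile time
rt.start();

let errorMessage = null;
try {
  rt.exportFunction('bar', (a, b) => a - b, sig); // too late: throws
} catch (e) {
  errorMessage = e.message;
}
console.log(rt.phase, rt.exportCount(), errorMessage);
```

The same function object is callable in both phases; only its outcome changes, which is all that the phase distinction requires.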
This doesn’t violate the spec in any way — it’s perfectly valid for the behavior of APIs to change as the state of the world outside changes. For example, in node I can open a file once and it might work fine, and then delete the file (i.e. change the world state, either through code or by hand), and then the next time I try to open the file, the API will give an error.
In MetalScript, you can imagine that your program is running inside a glass teleportation capsule, and there is a button in the capsule that allows you to teleport from “compile-time universe” to “runtime universe” (the button is mcu.start). There is also a pending job in a lowest-priority JavaScript job queue which will perform the teleportation automatically if your program never presses the button (i.e. your running program will be teleported automatically when all the scripts run to completion). The things you can see out the window of the capsule change at the moment of teleportation: the scaffolding around the launch pad of the capsule in compile-time-world vanishes, and you find yourself inside the living beast of the MCU[5]:
- The module resolver disappears, so require statements can no longer be used to load dependencies (you need to get all your dependencies “on board” before launching the capsule).
- The c-ffi export definitions are now cast in stone (or flash to be more accurate). You can no longer add or modify export definitions.
- Similarly, the memory layout, choice of GC algorithm, etc. all become fixed at the point of launch. In the teleportation analogy, you can think of this as choosing your life support for the journey before you leave.
- The heartbeat of the beast comes to life: timers start ticking, and GPIO becomes available.
- The outside world can start talking to you: functions that you previously exported may now start receiving calls[6].
The function signature
We’ve talked about how the symbols are registered in the library export symbol table, but we haven’t yet talked about how the “signature” of the function works.
It’s quite simple: the signature provided to c.exportFunction specifies the binary interface through which C functions (or any external code) can call the JavaScript function. I like to think of it like the following diagram. There is “JavaScript land”, made up of your library code and its imported dependencies (or builtin-dependencies), and there is “C Land”, which is all the other code in the system which is not under the control of MetalScript[7].
Calls from C into JavaScript are done by declaring the extern definition in C, which the C compiler uses to understand the signature of the call (how to pass arguments etc), and which tells the linker the name of the thing that is called. On the JavaScript side of the fence, the export information given to c.exportFunction is used by MetalScript to implement a stub that receives the arguments from C-Land, performs any parameter conversions, and forwards the call to the given JavaScript function.
The types that are provided in the signature, such as c.Int32, dictate the semantics of the conversions/marshaling between C and JavaScript. The Int32 argument type is one that is received from C as an int32_t and promoted to a first class JavaScript number value. In the opposite direction, when the Int32 result is sent back to C land, the dynamically-typed JavaScript value is coerced to an int32_t using familiar JavaScript rules, such as those used when writing to an Int32Array.
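The conversion rules can be tried out in any JavaScript engine. The sketch below models the stub in plain JavaScript, under the assumption that a signature type can be described by a pair of conversion functions. The names toInt32, fromC, toC, and makeStub are hypothetical and chosen for this illustration; in MetalScript the stub would be generated native code rather than a JavaScript function.

```javascript
// Coerce any JavaScript value to a 32-bit signed integer using the same
// rule that applies when writing to an Int32Array.
const scratch = new Int32Array(1);
function toInt32(value) {
  scratch[0] = value;
  return scratch[0];
}

// Hypothetical description of the Int32 signature type.
const Int32 = {
  fromC: (raw) => raw, // an int32_t is already representable as a JS number
  toC: toInt32,        // result is coerced on the way back to C
};

// Model of the stub: convert each argument, forward the call to the
// JavaScript function, then convert the result.
function makeStub(fn, signature) {
  return function stub(...rawArgs) {
    const args = rawArgs.map((raw, i) => signature.args[i].fromC(raw));
    return signature.ret.toC(fn(...args));
  };
}

const foo = makeStub((a, b) => a + b, { ret: Int32, args: [Int32, Int32] });

console.log(foo(1, 2));          // 3
console.log(foo(2147483647, 1)); // -2147483648 (wraps around, as in an Int32Array)
console.log(toInt32(3.7));       // 3 (fraction is truncated)
console.log(toInt32(NaN));       // 0
```

One consequence worth noting is that the JavaScript function itself stays dynamically typed: the sum 2147483647 + 1 is an ordinary number inside JavaScript and only wraps when it crosses back into C land.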
What it doesn’t look like
I’d like to conclude this article by highlighting what the MetalScript-way-of-doing-things doesn’t look like:
- It doesn’t require a manifest file or configuration files to describe the exports
- It doesn’t require many lines of hand-written boilerplate C code to perform conversions
- It doesn’t require any special C macros or classes, such as in native abstractions for node (NAN)
- It doesn’t force you to write pseudo-dynamically-typed C code, where you need special C function calls to query the type of a JavaScript value
- It doesn’t require any language extensions to JavaScript to handle binary types or export definitions
- It doesn’t require you to have an understanding of some virtual machine that the JavaScript code is running on (because there isn’t one)
If you’ve never tried to interface between a JavaScript script and native code (e.g. in node or espruino or something else), then perhaps the above points don’t make sense to you. You may think, “why would you need special macros or language extensions, or reams of boilerplate code?” And to that I say, “I agree”.
P.S. There are lots of details I haven’t covered here. If there’s something specific that you find confusing or disagree with, feel free to leave a comment below, or drop me an email.
[1] Obviously the names and structures may change.
[2] If it wasn’t confusing you, perhaps it is now that I’ve brought your attention to it.
[3] For example, I believe that webpack uses what I would call “hacky” processes to find “require” statements statically, and so it would be tricked by any number of legal JavaScript programs, such as if require were aliased with a different name.
[4] You could also export the same function instance 10 times if you wanted to, for example if you wanted to export the same JavaScript behavior with different C signatures, such as exporting an Int32 variant and a Double variant of the same function.
[5] If you are compiling a MetalScript stand-alone program, then you will be lucky enough to be inside “the beast” of the MCU. If you are compiling as a library, then “the beast” will be a foreign C program, which is far more vicious and its insides are toxic: radioactive with stray pointers, and tumorous growths of leaked memory.
[6] And interrupts may start firing, which is a topic for another article.
[7] To tie this back to the teleportation capsule analogy, the outer circle in the diagram is the capsule, and what I’m calling “C Land” is the outside world after teleportation. The export definition is part of the “capsule window” through which we can interact with the outside world.