Author: Michael Hunter

MetalScript Progress – July 2018

For those following along, here’s an update on my progress on MetalScript in the last month.

It’s been a bit of a slow month for MetalScript, just because I have a lot going on and MetalScript takes a bit of a backseat to other important things. I expect things to stay like this for a while longer.

Website

I spent a weekend hacking together a stub of a website. See http://metalscript.com. It’s really a placeholder for a website, not a real website. I spent a morning giving myself a crash course in Hugo and Slate for the first time, as well as trying to remember how to use Blender to create the somewhat rudimentary graphic depicting “JavaScript on a Microcontroller”. And lots of CSS hacking.

The website is not meant to attract customers, because of course there is no product yet. It’s meant to be a representation of the grand vision of MetalScript. It paints the picture of what I’m trying to get to, and so helps me stay focused on the goal. It’s also a place where I can start collecting user documentation so that when I launch the MVP there will already be documentation in place, and the documentation doubles as a spec of what it needs to do.

I didn’t spend much time on it, and it leaves lots to be desired. The markdown at the bottom of the home page is particularly jarring, and the whole thing is not very mobile friendly. But I’ll improve it over time.

I also couldn’t make up my mind on the color scheme, and landed up changing it a number of times. Originally I was thinking pink/orange (excuse my primitive “engineer’s vocabulary” for colors) because I think it has really awesome energy to it, and paints the picture of something new and exciting and different…

But when I was playing around with the MCU graphic, I started with yellow instead because it seemed to be “the JavaScript color” (if you Google images for JavaScript).

But quite frankly, yellow is ugly. Sorry to those who like it. It’s such a stark, flat color, and doesn’t say anything I want it to for MetalScript.

I went through a number of greenish shades which I dismissed. And while it’s probably not final, the website is currently based around an aqua blue.

Not everything matches, because I kept changing my mind about exactly where on the spectrum between blue and green I wanted to be.

Blue is nice because it’s calming but professional. This is very much what I intend MetalScript to be — a tool for professionals, but one without the discomfort and complexity associated with C and C++.

I also spent embarrassingly long on the front image. I was trying to convey the idea of putting JavaScript “onto” or “into” a microcontroller, but was worried about giving the impression that I am selling a microcontroller with a JS brand (I can’t assume that anyone visiting my website has prior expectations about what MetalScript is). My solution in the end was to create a somewhat unrealistic and stylized representation of an MCU, to try to get across the idea that the MCU in the picture is not a real thing I’m trying to sell.

Just because it’s unrealistic and stylized doesn’t mean it’s not a challenge for a Blender amateur like me. There are lots of details I tried to get right that a professional may have achieved better results in a lot less time. Take for example this comparison between an earlier draft (left) and a later one (right)…

I wanted to convey the idea of JavaScript “radiating” out of the microcontroller, as if the touch of MetalScript imbued it with superpowers. In the image on the left, the JS logo is actually radiating, but radiation (light emission) is pretty boring-looking in reality. On the right, I decided to just “pretend” to radiate by actually constructing emissive volumetric rays floating above the “JS”. This was much harder than I thought. For example, I had to provide an alpha (transparency) gradient from the bottom to the top so that the rays appear to just “fade out” gracefully as they move further from the apparent source, and not abruptly end when the volume ends. Half the time while doing this, I landed up having anti-emissive rays, if that’s a real term, that sucked in blue light (looking red-ish), a side effect of not providing the correct origin point or 3D orientation for the gradient volumetric texture.

(Other differences to note are the corners, the color, and the reflectivity of the surface).

By no means is anything final. I will likely change my mind several more times before MVP, and anyway, as soon as there is money I will probably get this all looked at by someone with better skills than mine.

Technical

But to the technical stuff I’ve been doing in the last month…

IL Compiler

I’ve added the necessary features to my IL compiler (compiles JS to IL) to support the following traditional example:

// blinky.js
MCU.start();
setInterval(() => gpio.pin('B4').toggle(), 500);

This is the hello-world of microcontrollers, to create a blinking LED. Of course, nothing yet blinks since there is a lot of work left to do on the compiler before I have actual machine code running on the device. But nevertheless it demonstrates a number of key features that the IL compiler needs, such as the ability to instantiate closures, call functions, access properties, etc.

Virtual Machine

Probably the biggest piece of work done in the last month is getting the virtual machine up and running.

It’s been tested printing “Hello, world!”, which is an important milestone. You might think that “Hello, world!” is a stupidly simple example, but actually the virtual machine needs to execute hundreds of IL instructions to achieve that simple task, most of which is setting up the realm (i.e. the global builtin “stuff” that JavaScript needs).

Perhaps more interestingly, I have the above blinky example running on the virtual machine. I say “more interesting” because the blinky example demonstrates another key feature: suspending the virtual machine before the next stage of compilation. What I currently have working is a program (blinky) that runs in the VM until the suspension point, after which the suspended machine is serialized as IL, including registers, stack, and heap allocations (the virtual heap includes functions). That’s a pretty cool point to be at!
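To make the suspend-and-serialize idea concrete, here is a minimal sketch using a toy stack machine. It is not MetalScript’s IL or VM: the instruction names, the `run` function, and the use of JSON as the serialization format are all assumptions made up for illustration.

```javascript
// Toy stack machine. NOT MetalScript's IL: every instruction name is hypothetical.
function run(program, state = { pc: 0, stack: [], output: [] }) {
  while (state.pc < program.length) {
    const [op, arg] = program[state.pc++];
    switch (op) {
      case 'push': state.stack.push(arg); break;
      case 'add': state.stack.push(state.stack.pop() + state.stack.pop()); break;
      case 'print': state.output.push(state.stack.pop()); break;
      case 'suspend': return { suspended: true, state }; // stop here, keep the state
      default: throw new Error(`Unknown op: ${op}`);
    }
  }
  return { suspended: false, state };
}

const program = [
  ['push', 2],
  ['push', 3],
  ['suspend'], // stands in for the call to MCU.start()
  ['add'],
  ['print'],
];

// "Compile time": run until the suspension point, then serialize the machine state.
const first = run(program);
const snapshot = JSON.stringify(first.state);

// "Runtime": restore the serialized state and carry on where the machine left off.
const second = run(program, JSON.parse(snapshot));
console.log(second.state.output); // [ 5 ]
```

The snapshot here only holds a program counter and a stack of numbers. The real thing also has to capture heap allocations, including functions, which is where most of the difficulty lies.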

Re-reading this, I think I should clarify what I mean by “virtual machine”. I’ve previously made it clear that MetalScript programs don’t run on a virtual machine, so why is there a virtual machine? The reason is that in the two-phase execution model of MetalScript, a program is executed at both compile time and runtime, and it is only the compile-time execution that is done on a virtual machine. The runtime execution is bare-metal.

What’s next?

The next stage is a post-processing step that will be executed on the suspended virtual machine state, that implements the behavior of “resuming” the JavaScript job that was suspended when MCU.start was called. This post-processing step is somewhat like a continuation passing style transformation. All the functions that were suspended in the call stack will be turned “inside out” so that their entry point is where they left off when the virtual machine was suspended.
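As a rough illustration of what turning a function “inside out” might mean, the sketch below shows a function before the transformation and a hand-written equivalent of its resumed form. The names `main`, `mainResumed`, `savedFrame`, and `suspend` are hypothetical, and the real transformation operates on IL rather than on JavaScript source.

```javascript
const log = [];

// Before: the suspension point sits in the middle of the function body.
// `suspend` is a hypothetical stand-in for MCU.start().
function main(suspend) {
  const greeting = 'hello';
  log.push(`before: ${greeting}`);
  suspend();
  log.push(`after: ${greeting}`);
}

// After: the locals that were live at the suspension point are saved as data,
// and a new function has its entry point where the original left off.
const savedFrame = { greeting: 'hello' };
function mainResumed(frame) {
  log.push(`after: ${frame.greeting}`);
}

main(() => {});          // straight-through run: logs "before" and "after"
mainResumed(savedFrame); // resumed run: logs only the "after" part
console.log(log); // [ 'before: hello', 'after: hello', 'after: hello' ]
```

With a call stack several levels deep, every suspended frame would need the same treatment, each resumed function returning into the resumed form of its caller.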

Believe it or not, the VM stack is 5 levels deep at the point where the blinky example is suspended, even though the call is done from the root scope of the script. This is yet another illustration of the fact that “it’s more complicated than it looks”.

The result of this step will be a body of IL that does not need to contain any definition for the current register states or call stack, which will be much easier for the next phase of the compiler to deal with.

I’m also finalizing the design of what I call the “symbolic interpreter” which essentially will step through the IL to figure out what it does (and thus how to implement it in machine code). This is the most complicated and critical piece of the whole project. As I’ve said in previous posts, through the side experiment I called “MiniLanguage” I’ve gone far enough with this idea to feel confident of the direction I’m going, but MetalScript is an order of magnitude more complicated than MiniLanguage so it’s going to take some time to iron out the details.

I’ve also spent some time thinking about how setInterval is going to work. There are any number of possible approaches I could take, but I’ve decided that the quickest path for the moment will be to implement it as “special” behavior that is part of the bare-minimum runtime library. Previously I said that the runtime library would only have the GC and event loop, but now I think that timers are also going to benefit from being built-in in this way. Among other things, this means I can postpone thoughts about implementing interrupts in pure JavaScript (which is completely plausible, but I don’t want to bloat the MVP with features that are not critical to using MetalScript, when there are perfectly reasonable pragmatic alternatives).
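For a feel of what a built-in timer facility might involve, here is a minimal sketch of an interval queue driven by a simulated clock. The `TimerRuntime` class and its `advance` method are assumptions for illustration only. On a real device the clock would be driven by a hardware timer, and this says nothing about how MetalScript will actually implement it.

```javascript
// Hypothetical minimal timer runtime driven by a simulated clock.
class TimerRuntime {
  constructor() {
    this.now = 0;     // simulated time in milliseconds
    this.timers = []; // registered interval timers
  }

  // Register a repeating callback. periodMs must be positive.
  setInterval(callback, periodMs) {
    if (!(periodMs > 0)) throw new RangeError('periodMs must be positive');
    this.timers.push({ callback, periodMs, due: this.now + periodMs });
  }

  // Advance the simulated clock, firing every timer that falls due, in time order.
  advance(ms) {
    const end = this.now + ms;
    for (;;) {
      let next = null;
      for (const timer of this.timers) {
        if (timer.due <= end && (next === null || timer.due < next.due)) next = timer;
      }
      if (next === null) break;
      this.now = next.due;
      next.due += next.periodMs;
      next.callback();
    }
    this.now = end;
  }
}

// A simulated blinky: toggle a flag every 500 ms.
let ledOn = false;
let toggles = 0;
const runtime = new TimerRuntime();
runtime.setInterval(() => { ledOn = !ledOn; toggles++; }, 500);

runtime.advance(1750); // fires at 500, 1000 and 1500
console.log(toggles, ledOn); // 3 true
```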

Looking further into the future, my plan is to keep pushing the blinky example through each phase of the compiler until I eventually get it out the other end and have an actual blinking LED. This would be a point of massive celebration.

Progress on MetalScript and random thoughts

This has been a good week for MetalScript.

This post is more of an informal ramble. Read it if you have extra time on your hands, but I don’t think I’m going to say anything profound here. Also read this if you’re doing a product competing with MetalScript, since I’m spilling some of the implementation here ;-)

A few weeks or months ago[1], I switched over from MiniLanguage to MetalScript. MiniLanguage is a new language I started about 9 months ago to test-drive a number of the MetalScript principles in a simplified context. It’s in MiniLanguage that I’ve created a working end-to-end pipeline from source text to output binary. But rather than perfect MiniLanguage with all the bells and whistles, I wanted to move back to the real project.

Breakthrough — bridging compile time and runtime

I had a breakthrough this week in terms of structuring code that implements the spec in a way that splits between runtime and compile time. In particular, the ParseScript operation (and some of the surrounding code) has been a bit of a pain in the butt: according to the spec, the function returns a Script Record, which contains both a reference to the realm[2] and a reference to the script code.

The script code is purely a compile-time construct, while the realm is purely a runtime construct. So ParseScript is kinda split between runtime and compile time. It’s really awkward to consolidate code that is split in this way, while still trying to make it maintainable. And I’ve been bashing my head against a wall for a while on the right way to do it.

Something finally clicked this week, and I found a way to have both runtime behavior and compile time behavior defined in the same lines of code. The way it works is essentially that my implementation of ParseScript returns a monad, that contains both the compile time component of the return value, as well as the sequence of runtime IL operations required to get the runtime component of the return value. The caller can immediately use the compile time component, and then is obliged to also emit code that invokes the runtime component.

For simplicity, I opted to implement the monad as just a tuple, as described by the MixedPhaseResult type below.

/** Used as the result of an operation that has both a compile time and runtime
 * component. The compile time component of the operation is the function
 * itself, and its result is the first element in the returned tuple. The
 * runtime component is represented as an IL function that is the second part of
 * the tuple. */
type MixedPhaseResult<T> = [T, il.ILFunction];

...

// https://tc39.github.io/ecma262/#sec-parse-script
function parseScript(unit: il.Unit, sourceText: string): MixedPhaseResult<ParseScriptResult> {
  ...
}

...
  const [parseResult, parseScriptRT] = parseScript(unit, sourceText);
  const scriptRecord = code.op('il_call', parseScriptRT, realm, hostDefined);

(Side note: Yes, of course MetalScript is being written in TypeScript, and the plan is to make it self-hosting so I can compile MetalScript with MetalScript in order to distribute an efficient binary executable of the compiler).

The beauty of this approach is that it allows me to have a single parseScript function in my implementation, which as you can see above has an embedded comment that references the exact location in the spec, and the full behavior of the corresponding piece of the spec is fully encapsulated in the body of parseScript. This one-to-one relationship between spec functions and implementation functions is going to be super useful from a maintenance perspective — keeping up to date with the latest spec — which as stated in a previous post matches one of my goals with MetalScript.

I’ve used this technique in a number of other places that bridge the gap between compile time and runtime, and I think the result is beautiful.
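To show the shape of the pattern without revealing any real IL, here is a self-contained sketch in plain JavaScript. Everything in it is hypothetical: `parseScriptSketch`, the fake “runtime function” object, and the `emittedCode` array are stand-ins for illustration, not the actual MetalScript internals.

```javascript
// Toy stand-in for an operation with both a compile-time and a runtime part.
// All names are hypothetical; the real IL and its API are not shown in this post.
function parseScriptSketch(sourceText) {
  // Compile-time component: something the caller can use immediately.
  const compileTimeResult = {
    statementCount: sourceText.split(';').filter((s) => s.trim() !== '').length,
  };
  // Runtime component: a description of work to be performed later.
  const runtimeFunction = {
    name: 'parseScriptRT',
    body: ['create_script_record', 'store_realm_reference'],
  };
  return [compileTimeResult, runtimeFunction];
}

// The caller uses the first element of the tuple right away...
const emittedCode = [];
const [parseInfo, parseScriptRT] = parseScriptSketch('print("a"); print("b");');
console.log(parseInfo.statementCount); // 2

// ...and is obliged to emit a call to the second element.
emittedCode.push({ op: 'il_call', target: parseScriptRT.name });
console.log(emittedCode.length); // 1
```

The point of the tuple is that neither half can be forgotten by accident: destructuring makes the runtime component visible at every call site.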

MetalScript Unit Tests

I’ve also started writing unit tests for MetalScript. Previously I avoided unit tests because there was too much uncertainty and I landed up completely changing my mind on things too often to make unit tests a useful addition. Now after spending 9 months in MiniLanguage, I feel I’m reaching a point of stability in the underlying concepts and have started adding unit tests.

The first unit tests I have working use the above parseScript and related functions to translate a source text to IL. And as of today I have this working for 2 simple test cases — an empty script and a “Hello, World” script.

I can’t show you the IL itself, because it would give away too many of the internals. Maybe when I’m further along in the project and there is less risk of having my ideas stolen, I will give up more details. But believe me when I say the IL is a thing of beauty.

The hello-world script translates to 62 lines of IL (including some whitespace), which is a lot, and emphasizes how many operations are actually required to perform simple tasks in JavaScript, and how much of an accomplishment it is to get to this point. Bear in mind that this IL language is designed by me with the intention of compiling easily, not to be a compact representation of the program, since the IL will never get to the target device.

Personal Note: Be comfortable with your work

A personal lesson I’m learning with this and other projects, is to do what it takes to feel comfortable with what you’ve done. In MetalScript, it’s a constant battle in my mind as to whether I should cut corners to save time and get to a POC quickly, or whether I should take it slow and make sure that every piece is as simple, understandable, reliable, and maintainable as possible.

There are arguments for both at different stages of a project, but if you plan on the project becoming something big, then I really believe you need to do what it takes to feel emotionally comfortable with what you’ve done. The reason is that when you leave a piece of code, and in the back of your mind you think of it as hacky, fragile, or overly complicated, and when you wrote it you just had to pray that it worked, then you aren’t going to want to go back to it, and you will become generally demotivated by your work. But if you leave a project or piece of code feeling comfortable about it, then it will be much easier to go back “home” to it in future.

So when I say that you should spend time on your work until you feel comfortable with it, I’m not talking about spending time making the most advanced piece of code you can be proud of, that has a gazillion features and can do backflips and handstands and handle a bunch of different use cases. I’m talking about thinking really carefully about how to remove complexity from your code and distill it down to its bare essence. You want to use your superpowers to remove complexity, not to handle it. Understanding a complicated design is only the first step; reducing it to a simple design is the end goal.

If your code is clean, simple, and has a good readme and guiding comments to help newcomers get into it, then you will feel more comfortable when you are the newcomer getting back into it after some time.

 


  1. Time is a blur 

  2. The realm is a collection of builtin objects such as Array and Object 

MetalScript Concepts: Using MetalScript in a C Project

I intend MetalScript to compile JavaScript code as either an executable or as a library. When I introduced the idea of MetalScript in a previous post, I gave a simple code example for how I imagine it to look when MetalScript compiles (and runs) a JavaScript program as an executable. In this post I’d like to present how I envisage it to look when a JavaScript program is compiled as a library, and a few juicy details about how this will work.

What do I mean by “library”?

Different people may mean different things when they say “library”. What I mean, in this context, is that the MetalScript compiler will be able to take a JavaScript file input, and produce a corresponding .o object file or .a static library, which can then be used by a traditional linker to merge the compiled JavaScript code into a larger compiled project.

Why would anyone want to do this?

This would certainly be an “advanced mode” use of the compiler. It brings back many of the things I hate about C/C++, such as linkers, intermediate files, etc, and so I would never recommend it as a starting point for new projects. The use case would be for people who have an existing legacy C or C++ codebase and who want to bring in a “sprinkle” of JavaScript to see how it feels, or perhaps even to bring existing NPM libraries into a C project.

What will it look like?

Let me get straight to an example. Here is one which would compile to a library that adds two numbers together. I’ve written it out intentionally verbose for expository purposes.

// mylibrary.js
import * as c from 'c-ffi';

function add(a, b) {
  return a + b;
}

c.exportFunction('foo',       // Exporting a function with the public name "foo"
  add,                        // The function we want to export
  {
    ret: c.Int32,             // The return type
    args: [c.Int32, c.Int32]  // The argument types
  }
);

And then to compile, you would execute something like the following command line:

metalscript --library -o mylibrary.a mylibrary.js

To use it, you might link it alongside the following C code:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

extern int32_t foo(int32_t a, int32_t b);

int main(void) {
  int32_t c = foo(1, 2);  // int may be narrower than 32 bits on some MCUs
  printf("The result is: %" PRId32 "\n", c);
  return 0;
}

Let’s break it down…

First, I consider that there will be a built-in MetalScript module which here I’ve called c-ffi (short for “C foreign function interface”), which provides the helpers and IO functions that are required to create library definitions allowing C to interface with our library. Think of how node has builtin modules for accessing the “outside world”, such as the fs module for accessing the file-system. The c-ffi module is providing analogous access to the outside world, except that here, the “outside world” is the surrounding C code in the larger project (for which MetalScript is not responsible).

c.exportFunction(name, fn, signature)

The c.exportFunction function[1] creates a new definition in the library symbol table, i.e. it creates a new symbol with external linkage. This symbol needs to have a name and in the above example I’ve given it the name 'foo'. I’ve called it foo and not add, just to make a point: the name in the symbol table is unrelated to the JavaScript idea of a function name. To drive this point home further, consider that functions in JavaScript don’t need to have names at all, so the example could have been written like this (notice the lambda definition):

import * as c from 'c-ffi';
c.exportFunction('foo', (a, b) => a + b, { ret: c.Int32, args: [c.Int32, c.Int32] });

I don’t understand. Does the compiler recognize c.exportFunction as special syntax, like C’s extern keyword?

No, the variable c here is a JavaScript object and c.exportFunction is a property on that object which is a function, just like any other JavaScript method. There is nothing special about the syntax c.exportFunction. You could have written c["ex"+"port"+"Function"] instead of c.exportFunction and it would have worked exactly the same.
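The equivalence is ordinary JavaScript property access, which can be checked against any object. The sketch below uses a mock `c` object that just records names, since the c-ffi module is only a proposal at this point.

```javascript
// Mock stand-in for the proposed c-ffi module; it only records export names.
const c = {
  exportedNames: [],
  exportFunction(name, fn, signature) {
    this.exportedNames.push(name);
  },
};

// Dot access and computed access resolve to the same property...
console.log(c.exportFunction === c['ex' + 'port' + 'Function']); // true

// ...so both forms of the call behave identically.
c.exportFunction('foo', (a, b) => a + b, {});
c['ex' + 'port' + 'Function']('bar', (a, b) => a - b, {});
console.log(c.exportedNames); // [ 'foo', 'bar' ]
```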

What may be confusing to you is that we have what appears to be a mix of compile time and runtime code in the same file[2]: the c.exportFunction statement executes at compile time, while the a + b expression executes at runtime.

Recall from my initial post on MetalScript that this is one of the key features of MetalScript: the program transitions between two phases, initially executing in the compiler and then moving to execute on the target device. Normally, when MetalScript is compiling a whole program (not just a library), the transition from compile time to runtime happens when the program calls mcu.start, and the entry point to the runtime program is the continuation of where the running process left off when it called mcu.start. In the case of a library however, the programmer has chosen to go the “advanced route” and implement the global entry point (e.g. the reset interrupt vector) in another language, so the JavaScript code won’t have an mcu.start statement. What this means is that the whole script executes at compile time to completion, producing the export definitions as an artifact (as a result of calling c.exportFunction), and then the compiler uses the exported definitions as the entry points to the static library that it produces.

If you are used to C or C++, this might make your hackles stand on end. You might think that this has to use some ugly, hacky, complicated and fragile heuristics to figure out what’s compile time and what’s runtime, and to separate these “magically”, until it doesn’t work and everything breaks[3].

But I’m here to assure you that I’ve gone into sufficient detail in the MetalScript design to know it won’t need to resort to such hackery. The code you write will all be interpreted according to the ECMAScript standard.

Here’s an example that may help put your mind at ease. Let’s say that instead of creating a library with one function foo that adds two numbers together, we want to create a library with 10 functions that add the numbers 1 to 10 to a given argument respectively, as represented by the following C code:

extern int add1(int x); // returns x + 1
extern int add2(int x); // returns x + 2
extern int add3(int x); // returns x + 3
extern int add4(int x); // returns x + 4
extern int add5(int x); // returns x + 5
extern int add6(int x); // returns x + 6
extern int add7(int x); // returns x + 7
extern int add8(int x); // returns x + 8
extern int add9(int x); // returns x + 9
extern int add10(int x); // returns x + 10

We can create such a library with the following JavaScript, which again I’ve written verbosely for illustration:

import * as c from 'c-ffi';

function createAddN(n) {
  function addN(x) {
    return x + n;
  }
  return addN;
}

const exportSignature = { ret: c.Int32, args: [c.Int32] };

for (let i = 1; i <= 10; i++) {
  const exportName = `add${i}`;
  const implementation = createAddN(i);
  c.exportFunction(exportName, implementation, exportSignature);
}

Some key points I’d like to highlight about this example:

  • The call to c.exportFunction only occurs once in this code (lexically), but because it is executed 10 times, there are 10 exports in the library. It is not some mystical static analysis that is producing these exports, but a real JavaScript engine executing code according to the defined standard.
  • The names in the symbol table in this example are dynamically computed.
  • You may notice that the program essentially exports addN 10 times, and each time it has a different behavior (adds a different number). This is because, in JavaScript, lexical functions (the code that makes up your function) are actually like function “templates”, which are instantiated every time the surrounding code is executed. Since createAddN is called 10 times, there are actually 10 distinct function instances of the addN function, each with a different closure, and each of these can be exported separately[4].
  • A more subtle point is that the value for the parameter n in createAddN is part of a closure that is carried over from compile time to runtime. Again, this is perfectly valid JavaScript code, and MetalScript doesn’t use hackery or trickery; it will work just fine.
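These points can be checked today in any JavaScript engine by swapping the proposed c-ffi module for a mock. In the sketch below, the `symbolTable` map and the mock `c` object are assumptions for illustration. They only record what was exported and do not produce a library.

```javascript
// Mock stand-in for the proposed c-ffi module (hypothetical; it only records exports).
const symbolTable = new Map();
const c = {
  Int32: 'Int32',
  exportFunction(name, fn, signature) {
    if (symbolTable.has(name)) throw new Error(`Duplicate export: ${name}`);
    symbolTable.set(name, { fn, signature });
  },
};

function createAddN(n) {
  return function addN(x) {
    return x + n; // n is captured in the closure of each addN instance
  };
}

const exportSignature = { ret: c.Int32, args: [c.Int32] };
for (let i = 1; i <= 10; i++) {
  c.exportFunction(`add${i}`, createAddN(i), exportSignature);
}

// One lexical call site, ten entries in the symbol table.
console.log(symbolTable.size); // 10

// Each entry is a distinct function instance with its own captured n.
console.log(symbolTable.get('add1').fn(100)); // 101
console.log(symbolTable.get('add7').fn(100)); // 107
console.log(symbolTable.get('add1').fn === symbolTable.get('add2').fn); // false
```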

If you aren’t yet convinced that this is a good idea, let me give you one more example. Let’s say that for debug purposes, you wanted to log any calls to exported functions. You could add the following at the beginning of your JavaScript program:

// Replace c.exportFunction with our own definition that logs incoming calls
const actualExport = c.exportFunction;
c.exportFunction = function (name, fn, signature) {
  function wrapper(...args) {
    console.log(`Calling function ${name} with arguments ${args}`)
    return fn(...args)
  }
  actualExport(name, wrapper, signature);
}

This snippet of code wraps the c.exportFunction function with our own implementation. Each time our own exportFunction function is called, it creates a wrapper function and exports that wrapper function instead. The wrapper function calls the original function, but it also logs the call to the console so we can see it happening.

If this doesn’t blow your mind, maybe it’s because you’re coming from “JavaScript land” and think that this example is obvious because it’s just vanilla JavaScript.

Does this mean I can export at runtime?

No (not surprisingly).

You may have thought that since c.exportFunction is a function, and functions can be called at runtime, that we can therefore call c.exportFunction at runtime to create new export definitions while the program is running — but this isn’t the case. Or maybe you thought that since this couldn’t work at runtime (because you know that libraries can’t change after they’ve been compiled), that the whole idea is invalid, but that isn’t true either.

The distinction between compile time and runtime is not invisible to the user code (nor should it be). The MetalScript APIs have different behavior depending on what phase the program is in, as one would expect. For example, the c.exportFunction function will only work at compile time (the call will throw an exception at runtime), and reading GPIO will only be meaningful at runtime.

This doesn’t violate the spec in any way — it’s perfectly valid for the behavior of APIs to change as the state of the world outside changes. For example, in node I can open a file once and it might work fine, and then delete the file (i.e. change the world state, either through code or by hand), and then the next time I try to open the file, the API will give an error.

In MetalScript, you can imagine that your program is running inside a glass teleportation capsule, and there is a button in the capsule that allows you to teleport from “compile-time universe” to “runtime universe” (the button is mcu.start). There is also a pending job in a lowest-priority JavaScript job queue which will perform the teleportation automatically if your program never presses the button (i.e. your running program will be teleported automatically when all the scripts run to completion). At the moment of teleportation, the things you can see out the window of the capsule change. The scaffolding around the launch pad of the capsule in compile-time-world vanishes, and you find yourself inside the living beast of the MCU[5]:

  • The module resolver disappears, so require statements can no longer be used to load dependencies (you need to get all your dependencies “on board” before launching the capsule)
  • The c-ffi export definitions are now cast in stone (or flash to be more accurate). You can no longer add or modify export definitions.
  • Similarly, the memory layout, choice of GC algorithm, etc. all become fixed at the point of launch. In the teleportation analogy, you can think of this as choosing your life support for the journey before you leave.
  • The heartbeat of the beast comes to life — timers start ticking, and GPIO becomes available.
  • The outside world can start talking to you: functions that you previously exported may now start receiving calls[6].
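The phase-dependent behavior described above can be mimicked with a mock in a few lines. In this sketch, the `phase` variable and the mock `c` and `mcu` objects are assumptions for illustration. The actual error type and message that MetalScript would use are not specified anywhere in this post.

```javascript
// Mock of phase-dependent API behavior. All names and messages are hypothetical.
let phase = 'compile-time';
const exportTable = [];

const c = {
  exportFunction(name, fn, signature) {
    if (phase !== 'compile-time') {
      throw new Error(`Cannot export "${name}": exports are fixed at runtime`);
    }
    exportTable.push({ name, fn, signature });
  },
};

const mcu = {
  start() {
    phase = 'runtime'; // the "teleportation button"
  },
};

// At compile time, exporting works.
c.exportFunction('foo', (a, b) => a + b, {});

// After the transition, the same call throws.
mcu.start();
let errorMessage = null;
try {
  c.exportFunction('bar', (a, b) => a - b, {});
} catch (e) {
  errorMessage = e.message;
}

console.log(exportTable.map((entry) => entry.name)); // [ 'foo' ]
console.log(errorMessage); // Cannot export "bar": exports are fixed at runtime
```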

The function signature

We’ve talked about how the symbols are registered in the library export symbol table, but we haven’t yet talked about how the “signature” of the function works.

It’s quite simple: the signature provided to c.exportFunction specifies the binary interface through which C functions (or any external code) can call the JavaScript function. I like to think of it like the following diagram. There is “JavaScript land”, made up of your library code and its imported dependencies (or builtin-dependencies), and there is “C Land”, which is all the other code in the system which is not under the control of MetalScript[7].

Calls from C into JavaScript are done by declaring the extern definition in C, which the C compiler uses to understand the signature of the call (how to pass arguments etc), and which also tells the linker the name of the thing that is called. On the JavaScript side of the fence, the export information given to c.exportFunction is used by MetalScript to implement a stub that receives the arguments from C-Land, performs any parameter conversions, and forwards the call to the given JavaScript function.

The types that are provided in the signature, such as c.Int32, dictate the semantics of the conversions/marshaling between C and JavaScript. The Int32 argument type is one that is received from C as an int32_t and promoted to a first class JavaScript number value. In the opposite direction, when the Int32 result is sent back to C land, the dynamically-typed JavaScript value is coerced to an int32_t using familiar JavaScript rules, such as those used when writing to an Int32Array.
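The Int32Array rules mentioned above are easy to observe directly. The helper below (the name `coerceToInt32` is made up for this sketch) writes a value into a one-element Int32Array and reads it back, which is the same coercion that standard JavaScript applies with `value | 0`.

```javascript
// Coerce any JavaScript value the way a write to an Int32Array element does.
function coerceToInt32(value) {
  const cell = new Int32Array(1);
  cell[0] = value; // converts to a number, then truncates and wraps to 32 bits
  return cell[0];
}

console.log(coerceToInt32(3.7));         // 3            (fraction truncated)
console.log(coerceToInt32(-3.7));        // -3           (truncates toward zero)
console.log(coerceToInt32(2 ** 31));     // -2147483648  (wraps around)
console.log(coerceToInt32(2 ** 32 + 5)); // 5            (modulo 2^32)
console.log(coerceToInt32(NaN));         // 0
console.log(coerceToInt32('42'));        // 42           (string converted to number)
```

So a JavaScript function exported with an Int32 return type could return 3.7 and the C caller would see 3, with no exception and no undefined behavior.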

What it doesn’t look like

I’d like to conclude this article by highlighting what the MetalScript-way-of-doing-things doesn’t look like:

  • It doesn’t require a manifest file or configuration files to describe the exports
  • It doesn’t require many lines of hand-written boilerplate C code to perform conversions
  • It doesn’t require any special C macros or classes, such as in native abstractions for node (NAN)
  • It doesn’t force you to write pseudo-dynamically-typed C code, where you need special C function calls to query the type of a JavaScript value
  • It doesn’t require any language extensions to JavaScript to handle binary types or export definitions
  • It doesn’t require you to have an understanding of some virtual machine that the JavaScript code is running on (because there isn’t one)

If you’ve never tried to interface between a JavaScript script and native code (e.g. in node or espruino or something else), then perhaps the above points don’t make sense to you. You may think, “why would you need special macros or language extensions, or reams of boilerplate code?” And to that I say, “I agree”.

P.S. There are lots of details I haven’t covered here. If there’s something specific that you find confusing or disagree with, feel free to leave a comment below, or drop me an email.


  1. Obviously the names and structures may change. 

  2. If it wasn’t confusing you, perhaps it is now that I’ve brought your attention to it. 

  3. For example, I believe that webpack uses what I would call “hacky” processes to find “require” statements statically, and so it would be tricked by any number of legal JavaScript constructs, such as require being aliased under a different name.

  4. You could also export the same function instance 10 times if you wanted to, for example if you wanted to export the same JavaScript behavior with different C signatures, such as exporting an Int32 variant and a Double variant of the same function 

  5. If you are compiling a MetalScript stand-alone program, then you will be lucky enough to be inside “the beast” of the MCU. If you are compiling as a library, then “the beast” will be a foreign C program, which is far more vicious and its insides are toxic — radioactive with stray pointers, and tumorous growths of leaked memory. 

  6. And interrupts may start firing, which is a topic for another article 

  7. To tie this back to the teleportation capsule analogy, the outer circle in the diagram is the capsule, and what I’m calling “C Land” is the outside world after teleportation. The export definition is part of the “capsule window” through which we can interact with the outside world 

Why not Espruino?

I recently posted about MetalScript, a JavaScript compiler I’m creating to allow people to write firmware in JavaScript. One of the responses I received is that there already exists a solution for running JavaScript on a microcontroller: Espruino. Why am I creating something new?

TL;DR: Espruino uses an interpreter, which takes up memory, runs slowly, and ES conformance is sacrificed for speed. MetalScript is a compiler, so all the memory is available to the application, which will run quickly, and there is no need to sacrifice conformance for speed.

For those who don’t know, Espruino is a JavaScript interpreter that is designed to run on an MCU. This means that it can take JavaScript source code (either on the serial port as you type it, or stored as text in MCU flash), and the interpreter running on the MCU itself will execute it to produce the desired behavior. Espruino actually sells a joint solution which includes both the hardware and the interpreter running on it, but in this article I will be focusing only on the interpreter.

Let me start by saying that I think Espruino is great at what it does. It really does achieve a lot in a comparatively small code footprint. Espruino apparently uses 100-200kB of flash (ROM), plus whatever size you need for storing code, and apparently it can operate in under 8kB of RAM. This is pretty impressive, and I commend Gordon and the Espruino team for their ability to pull this off.

But having said that, I’ve considered Espruino for two different projects in the past and chosen not to use it for either of them, for reasons which I’ll unpack here. The one project was an IoT smart parking meter, and would have used Espruino as a scripting engine for user workflows. In this scenario, there was a mature existing codebase of firmware which we had no intention of changing, but we wanted to augment it with the ability to run simple scripts for customization purposes.

The other project was an IoT gateway device for a sensor network. We had existing hardware, but the existing firmware code was in a serious state of disrepair and we were looking at rewriting it from the ground-up. Here I was considering the possibility of JavaScript being the primary language for the firmware.

Let me make it clear that this is not a comparison between tools, but rather about why Espruino did not address my specific needs, and how I intend to make a tool that would have addressed those needs, and so might help others with similar needs in future.

VM Size

In both projects where I considered Espruino, we were limited by the amount of RAM and ROM available. But particularly in the parking meter solution, the size and memory overhead of the VM was a concern. While it’s very impressive that Espruino only takes 8kB of RAM and 100kB of ROM, this would still have consumed most of the resources on both devices, leaving very little for anything else.

Particularly on the smart parking meter, it would have been completely disproportionate to have a single feature consume this level of resources, bearing in mind that we only needed to run one or two short scripts to manage the flow between different screens.

How will MetalScript address this?

MetalScript will compile code to run bare-metal, meaning that it doesn’t require any supporting runtime infrastructure such as a VM or interpreter. This means that like in C, an empty JavaScript program will consume almost no RAM or ROM — probably in the order of 1 kB of ROM and 100 B of RAM, for the event loop, garbage collector, and bare-bones startup instructions.

Additionally, RAM usage of a firmware application will be made smaller because the compiler can infer types. For example, a field or variable that is inferred to be an integer might take 4 bytes (as opposed to Espruino’s 16 bytes). Type inference will not always succeed, so real-world memory usage will likely be somewhere in between.

Code Size

Since the performance page on the Espruino website deals with this issue specifically, I’m going to take a little more time to address it. The page has quite a nice demonstration of the program sizes for an example function that draws a Mandelbrot fractal. It provides compelling evidence that JavaScript source code can be similar in size to the compiled binary produced by GCC from an equivalent C function. When minified, the size of the JavaScript source text is close to half the size of the GCC output.

This means that if you are taking C functions and writing them in JavaScript, like in the example, you could actually save on flash memory by writing your code in JavaScript in Espruino.

But I think this misses the point of what makes JavaScript so great. I don’t want to write JavaScript so that I can write C-style code in JavaScript syntax. I want to leverage all the things that make JavaScript great.

Let me use a different example to highlight what I mean. In C, if I want to write a function that converts a string to title-case, I might do it like this:

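#include <ctype.h>   // toupper
#include <stdbool.h> // bool
#include <stdio.h>   // puts

// Capitalize the first letter of each space-separated word, in place.
void titleCase(char *sentence) {
  bool capitalizeNext = true;
  for (char *c = sentence; *c != 0; c++) {
    if (capitalizeNext)
      *c = (char)toupper((unsigned char)*c); // cast avoids undefined behavior for negative char values
    capitalizeNext = (*c == ' ');
  }
}

int main() {
  // An array initialized from a string literal is a writable copy,
  // so it is safe to modify in place.
  char sentence[] = "hello, world!";
  titleCase(sentence);
  puts(sentence);
}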

If I was to do it in JavaScript, I would do it like this:

  1. I happen to know that there is an NPM library called change-case which can do this. I found this by spending about 30 seconds Googling for it sometime back.
  2. Look at the readme on the main page — yes, it supports what I want
  3. yarn add change-case 1
  4. const { titleCase } = require('change-case')
  5. console.log(titleCase('hello, world!'));

The JavaScript way is better for a developer for many reasons, which JavaScript developers are probably all familiar with:

  • It’s less of your code, so it’s more maintainable (other people can maintain the dependencies, and their work will be leveraged by the 1000s of people that use their library).
  • The writers of the library have probably spent time to think about edge cases that you may not have thought about.
  • The C version is brand new code, and has not had time to mature to iron out any bugs and edge cases that come with real-world usage.
  • The JavaScript version is used by many people — there are 762 dependents of this package listed on NPM2. This is a form of hardening and improves reliability — the more dependents there are on the code, the more stable and bug-free the code will be.

What code size would this be in Espruino? I don’t know. The project pulls in 18 dependencies. The code for the path we care about is pretty long, involving a case normalization step, followed by a regular expression to add in the title case characters, and a number of different files. I don’t know how Espruino’s module system works (I haven’t looked at it), but there may be overhead there as well. I wouldn’t be surprised if this library pulled in 50 kB of JavaScript code, and maybe if you used Webpack to minify and remove unused code, that might go down to 2 kB — just a complete guess. Either way, it is likely one or two orders of magnitude larger than the compiled C code.

It’s true that I have picked a particularly pathological case to demonstrate my point. But I think the principle stands true. JavaScript developers are more effective than C and C++ developers in large part due to the fact that most of the code in their projects they do not need to write themselves. Conversely, JavaScript library writers are more effective than C++ library writers, because their productivity is multiplied by a larger factor across the thousands of people using their library. But one of the costs here is that libraries need to cater for a wide range of usage scenarios, which makes them bigger and heavier. This is particularly bad for an interpreter.

That tells you why I think it’s a problem for Espruino. But how would MetalScript deal with this?

I see package support as one of the biggest advantages of using JavaScript, so from the beginning this has been one of the objectives of the MetalScript project. There are a number of angles that I’m using to attack this problem:

  1. As described in the previous post, a MetalScript program executes in two phases: it starts executing at compile time in a VM running in the compiler, and then gets suspended and the suspended VM is compiled. Dependencies are loaded at compile time, and so even a complicated dependency tree will not incur runtime loading overhead (e.g. all the require statements and initialization code for a library like this will execute at compile time)
  2. Unreachable code is automatically eliminated by the garbage collector in the compile-time VM, since code in JavaScript is just like any other data value. There is no need for a separate dead-code elimination step — it comes for free. So all the extra unused functionality of a library will just fall away.
  3. A special case of the above two points, is that if a library is only used at compile time, it can be used and freed before the program is even loaded onto the MCU. In C, there have been a number of times where I’ve needed to write a stand-alone tool to pre-generate lookup tables or complex constant structures to be used by a firmware application — in MetalScript, this can just be done in normal code because of the two-phase execution. So if you can find ways to use the libraries at compile time and cache the results for use at runtime, those libraries will actually take up zero flash and RAM.
  4. MetalScript uses global optimization techniques to push constants and other type information through the control flow graph. This allows parts of the application that don’t change to be removed from the binary output. In particular, this allows libraries to use dependency injection and options hashes with zero runtime overhead, provided that the running application does not need to change the injections/hooks and options during the execution of the program.
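Point 3 in the list above can be sketched in ordinary JavaScript. The mcu object here is a hypothetical stand-in so that the example runs under node; in MetalScript, as described in this post, the call to mcu.start() would mark the boundary between compile time and runtime. The sine table, its size, and the function names are assumptions chosen purely for illustration.

```javascript
// Hypothetical stand-in for MetalScript's `mcu` object so this runs under node.
// Here start() only flips a flag; it does not model real MetalScript behavior.
const mcu = {
  started: false,
  start() { this.started = true; },
};

// --- Before mcu.start(): would execute at compile time ---
// Build a lookup table with ordinary code instead of a separate generator tool.
function buildSineTable(size) {
  const table = new Uint8Array(size);
  for (let i = 0; i < size; i++) {
    // Map sin() from the range [-1, 1] onto the byte range [0, 255]
    table[i] = Math.round(127.5 + 127.5 * Math.sin((2 * Math.PI * i) / size));
  }
  return table;
}
const sineTable = buildSineTable(256);

mcu.start();

// --- After mcu.start(): would execute at runtime on the device ---
// Only the finished table is needed from here on; buildSineTable is never
// called again, so it would be unreachable once the program is suspended.
function sineAt(phase) {
  return sineTable[phase & 0xff];
}

console.log(sineAt(0), sineAt(64), sineAt(192));
```

The idea is that the 256-byte table would end up in the compiled image while the code that generated it would not, which is the job a stand-alone table generator does in a typical C workflow.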

Syntactic Style Matters

This might be my biggest turn-off with Espruino. Since Espruino has a parser and interpreter running on the device to interpret the source text, things like comments and whitespace affect the speed of execution (see here). I don’t need to say much about this. Anyone coming from a modern programming background will understand why this is a problem.

The solution they propose is minification. Maybe that works for you, maybe it doesn’t. If Espruino has strong support for debugging with source maps, maybe this is fine (does it?).

Performance

I don’t think I need to spend much time justifying this point. On the Espruino website, the same example that talks about whitespace also presents an interesting performance metric. Apparently the following code produces a 4 kHz square wave (it loops 4000 times per second):

while (1) {A0.set();A0.reset();}

In C, I would imagine similar-intentioned code to be about 1000 times faster. Perhaps this example is particularly bad, but I think it goes without saying that interpreting text on the fly is not going to be quick.

MetalScript will deal with this by being compiled. I imagine MetalScript will be comparable in performance to C for this example, since most of the program remains constant and would be optimized out by the symbolic executor I spoke of earlier (the variable and property lookups, and the function calls).

Conformance

Espruino claims 95% compatibility with the ECMAScript specification, and apparently something of a mix between ES5 and ES6 at this point. 95% is pretty poor in my opinion. This means I might often run unit tests on my code using node.js and it all passes, and then find the code behaves differently when I download it to the device.

The FAQ says that if you’re writing “normal JavaScript” then you probably won’t have a problem, but it’s not clear what exactly it does or doesn’t support, so it’s a bit hit-and-miss.

My biggest concern with lack of conformance is that it will impact the ability to include third party libraries. It’s all very well writing your own code from scratch to use Espruino, where you can work around various unsupported features or deviations from the spec. But when it comes to NPM packages that are not specifically designed for Espruino, you get what you get, and it’s the difference between spending 5 minutes to use an off-the-shelf library to do something vs spending a month writing and debugging it yourself.

I can’t even say for sure that the earlier change-case library example will actually work in Espruino — please can anyone who has an Espruino try including it into a project and tell me what the experience is like? Does it work? How big is it? How easy to integrate compared to installing a dependency for node.js?

Professional Workflow

The quick start guide gets you up and running pretty quickly with code on the REPL in a terminal, or single scripts in their Chrome app IDE. But then what? If I’m a real professional developer, and want to create a real-world product, the REPL will be an interesting novelty for the first 5 minutes of the project, and then I’ll be asking,

How do I do real development on this thing?

What do I mean by “real development”?

  • Creating a project in a way that it can grow to thousands of source code files
  • Setting up a professional IDE to run, edit, and debug the source code using all the modern tools you would expect for such.
  • Structuring your project, and bringing in drivers and libraries as needed

Perhaps someone with more experience with Espruino can tell me where the missing starter guide is for a JavaScript professional who wants to make a real product. As it stands, Espruino and its Web IDE look like educational tools for children and hobbyists3.

Unfortunately, I would say that if Espruino is not good for professionals, then it’s also not good for beginners. The reason is that if you are a beginner in something, you would rather learn a tool that will carry over to larger and more professional projects when you outgrow your noob shoes.

How is MetalScript intended to be different?

  • MetalScript will be a command-line tool with a similar CLI to node.js. It will be easy for existing JavaScript developers to get started with this interface since it’s familiar, but critically it will be the same interface that can be used at scale in professional work.
  • The interface to MetalScript will include the node-inspector debugger API, so that IDEs such as VS Code can be used seamlessly with MetalScript for a full, modern editing and debugging experience.
  • The two-phase execution feature means that modules can be unambiguously resolved and loaded at compile time, so only the path of the entry script needs to be provided to MetalScript, like it is with nodejs. There is no need to have separate project files, manifest files, or configuration files, to work with larger projects.

Conclusion

A lot of what I’ve said is speculation. I’m speculating about Espruino based on what I’ve read online, and I’m speculating about how MetalScript will be based on the way I’ve designed it and the way the proof of concept is going. When MetalScript is working, this topic should be revisited.

If you’re an Espruino user, I’d love to hear your experience with it. Have I given it a fair chance? Where are the areas where you think it excels or suffers? What are the lessons that MetalScript should take from it?

Please feel free to share your rants and disagreements with me in the comments.

 


  1. or npm install --save change-case 

  2. These are only the ones listed, which only include public projects that are also registered on NPM — think how many thousands more projects use this library that are not listed 

  3. This is not necessarily a bad thing, but it is not what I need 

MetalScript: The Idea

People who know me, or have been reading my blog for a while, probably know that I’ve been working on a JavaScript firmware compiler, which I’ve called MetalScript. For the first time ever, I’m going to share a little about it with you, my beloved blog readers.

If you’re not interested in the story behind MetalScript, skip about halfway down this post to the Enter: MetalScript heading. I’ll warn you ahead of time that this post is pretty long, because this is a topic I’m really passionate about.

My Story

I first started to learn to work with firmware and microcontrollers when I was 12. My dad is an electrical/software engineer, and maintained a workshop in one of our garages, with many different kinds of electronic components and tools. He showed me how to put together a basic microcontroller to control some LEDs for an electronic dice1 I did for a school project. I must admit that I didn’t absorb much during that project — I was becoming a teenager, and had just moved schools, and although I was quite proficient at programming by that stage, electronics was still a dark art to me. But through the following few years I became more familiar with it, and it’s nevertheless quite interesting looking back on the experience I had when I was first introduced to the tools.

The first microcontrollers I learned to work with were in the Microchip PIC family2. If I recall correctly, a simple circuit for a PIC to control an LED might have just a handful of components:

  • A battery (or bench power supply)
  • A voltage regulator, to convert the power from the battery into the 5V needed to run the MCU. Note that some microcontrollers don’t need a regulator and can work directly from the battery.
  • The microcontroller, which is like a tiny computer in a single chip (without any peripherals on its own such as a display or keyboard etc), having its own RAM, ROM, and CPU built into the single chip.
  • A crystal oscillator — this is the one part of the “tiny computer” that couldn’t be put into the MCU chip (in some MCUs it can be, but not the one I was working on). It’s not important what it does, but just that it’s one of the “ingredients” in creating a simple MCU circuit.
  • An LED (with a resistor to limit the current), as an example, so we actually have something for the microcontroller to control

That’s fewer parts than some kitchen appliances, to give you a tiny computer that you can write software for! (And as I say, some MCUs require even less to get started, if they have internal RC oscillators and are more permissive on the input power).

I don’t have a photo of those early circuits (digital cameras weren’t really a thing back then), but here’s a photo from a few years later of a slightly more complicated board, that has 4 LEDs and a monochrome graphics display (the graphics display is the most complicated part, requiring a lot of data lines and a charge pump to drive negative voltages for the backlight of the display).

Side note: this photo was taken in my bedroom3.

I won’t go into detail on how these are made. There are plenty of online resources to guide you through DIY electronics. The main thing I want to highlight is how simple it was. Once you know how, you can solder together4 a working microcontroller circuit from scratch in probably 30 minutes5.

So what about the code?

For these Microchip PIC microcontrollers, you downloaded some free software (MPLAB I think it was called). You went through a wizard to create a new project, where you selected the type of microcontroller, and some other basic stuff, and it created a new project file and a main C file (or assembly — depending on your choices). There were just 2 files you cared about — the code file, and project file. You write your C code into the code file, and click a little button in the IDE to put the code on the MCU and make it run.

I’d characterize this process as “relatively simple”. You write code, and you click the button to download and run it, and that’s it.

To know how to write the code, you would read the datasheet — I emphasize that this was a single datasheet, that concisely described pretty much everything you need to know for the project, including pin-outs, memory layout, IO registers, example code in C, electrical specifications, some reference circuits, and even a description of the instruction set for the device. I’m hugely disheartened these days when I have to refer to 5 different documents written by different companies in order to figure out how to use something.

Writing firmware wasn’t as easy as writing desktop application code, but it was ok. The IDEs were a bit worse, the debuggers a bit less predictable, and there was more work to understand the platform you’re on. It wasn’t a blissful experience, but it was marginally acceptable.

The state of firmware development today

In my mind, the state of firmware development over the last two decades has gone downhill from “marginally acceptable” to “a hideous monster”. Microcontrollers have become more powerful, which is a great thing, but with it has come the opportunity for astronomical amounts of complexity.

To get a minimal working example (a flashing LED) using a modern embedded architecture, you are quite possibly going to need:

  1. A CPU
  2. RAM
  3. A flash chip or SD card
  4. Possibly a ROM chip to hold the bootloader
  5. A ton of supporting circuitry

If you want to assemble the circuitry yourself, bear in mind that many modern day CPUs and MCUs don’t come in DIP packages (the kind I showed in the photo earlier that you can solder into some veroboard yourself), they come in impossible-to-hand-solder packages like BGA. So you’re going to need to buy some adapters, but finding the right ones is going to be tricky. You may just want to buy a module that includes all of this complex circuitry pre-packaged but uncustomizable. Or you can spend 3 weeks having a proper PCB fabricated. There is no good option.

With this setup, you no longer just have C code that is compiled to run directly on the device. Rather, you have a software stack, starting with a bootloader such as U-boot, which reads a file system to load a Linux kernel, and the Linux kernel boots the operating system (the set of device drivers and such), and at some point invokes a startup script, which invokes your application.

If the application wants to toggle an LED, it can no longer set a bit in an IO register. Rather it will probably use a software framework which tells the Linux kernel that it wants to output to the GPIO, which dispatches the request to the GPIO driver, which in turn toggles the bit on your behalf. I can’t even confirm to you that this is exactly what happens, despite doing months of research and having spent a fair amount of time writing my own application code in such an environment, because it’s so bloody complicated.

I’ve found that actually compiling and getting the application code onto the device is equally nightmarish. You are now essentially working with two computers: your development computer and a tiny embedded computer running a different (but almost as complicated) operating system, and getting code on your computer to run on the device might involve network drives, or FTP servers, remote debug servers, etc. This is all required just to get a device that blinks an LED.

You might be able to hide from these details for a short while. You may buy a development kit which already has all these things pre-loaded on it. But sooner or later you will be exposed to these details. Say, you need to install a driver for something, or you need to update/configure the kernel, or you need to figure out how to boot from a remotely-hosted image to debug something, or whatever. The abstraction is leaky, and sooner or later you’re going to need to enter that rabbit hole, and it ain’t no wonderland.

I’m not exaggerating when I say that this seems to be the modern way of turning on and off a few LEDs. I spoke to a guy the other day who tells me that the LED street signs here in Melbourne are each running Linux. In fact, each sign runs a full web server stack that you could log into if you wanted. These signs are literally just controlling a panel of LEDs, and somebody decided that the best way to do this was to run a full Linux operating system. I would speculate that they chose this route so that they could run the web stack to make the devices remotely configurable, but I would argue there are better ways to do it, if the tools existed. Imagine the size of the team and expertise that were probably required to do this software.

I don’t care much that it uses more power or costs more to run Linux on a sign, since the incremental cost is negligible compared to the cost of the sign, and power is not an issue here. What I’m concerned about is the deep expertise required to do it, which makes it difficult for the average guy to do something similar, and difficult for beginners to get into. And to reiterate, buying a prepackaged module doesn’t solve the complexity, it just delays your exposure to it.

The JavaScript Perspective

I’m lucky enough to be a software engineer who works full-full stack. That is, I write software for everything from microcontrollers, to websites, and in between. It exposes me to a wide range of tools. One of those tools is JavaScript6. Whenever I mentally context switch from JavaScript development to embedded C or C++ development, it feels like I’m driving into a pit of smelly mud. C++ is full of landmines with memory corruption and undefined behavior. Error messages are complicated. The build process is complicated, involving C preprocessing, template preprocessing, compilation, object files, linker scripts, etc. It cripples your ability to write proper abstractions, or reusable code. I’ve talked about these problems in past blog posts.

JavaScript doesn’t have these issues. There is no preprocessing, no template processing, linker scripts or intermediate object files. A JavaScript program cannot corrupt memory, and there is no such thing as undefined behavior7. You can easily write abstractions to reduce the complexity of your code.

JavaScript is also easy for beginners to learn. But perhaps most importantly, it’s easy in JavaScript to integrate third party code in the form of reusable libraries. In C or C++ it might take a week to get set up and familiar with a reasonably-sized foreign library, while in JavaScript it can be a matter of minutes.

Enter: MetalScript

What I am trying to achieve with MetalScript is to bring back the simplicity of the “good ol’ days” of microcontroller development, but bringing it up to modern development standards by supporting JavaScript rather than C or C++.

I want to see a world where you can write a firmware program in JavaScript, and with a single command, run the JavaScript program on the microcontroller. No preparation required (other than plugging the device in). You don’t need Linux, or multi-stage bootloaders. You don’t need project files and linker scripts. You just need your main.js file, and a microcontroller to put it on.

This is really important, so I think it bears repeating. What I am trying to make, is

A tool that allows you to put a JS program onto an MCU and run it.

No fuss. No linker files. No manifest files, or configuration files of any sort. No make files. No pre-downloading a runtime or an interpreter or a VM or operating system. No 10 step installation process with third party tools and dependencies which “almost” work together.

How will it work?

Similar to how a C program was run in the old Microchip MPLAB compiler/IDE I spoke of earlier, MetalScript will put a JavaScript program onto your device by first translating it to a format that the device can natively execute. This step is traditionally thought of as compilation, and that’s why I refer to MetalScript as a compiler.

The compilation will produce a result that is self-contained, in that it requires no extra dependencies to be loaded onto the device in order for it to execute. It runs bare metal (hence the name MetalScript), not requiring any supporting interpreter or operating system. In a typical use, especially by beginners, the build artifacts will be loaded directly onto the device and the user won’t even know about it. More advanced workflows will exist for those who want to compile as a library or create an image.

Runtime Library

When C firmware is compiled and loaded onto a device, it’s actually not your C code that starts to run when the device is powered up. The compiler typically inserts some assembly code called the startup script into the executable, and this does some useful initialization such as wiping the memory and initializing global variables. MetalScript is in a way lower level than C, in the sense that the startup process is invoked and controlled by user-written application code, by calling a function called mcu.start(). A typical LED blinking example might be written as follows:

// Run startup routine with the default options (this initializes the processor
// clock, allocates heap and stack memory, initializes the garbage collector, etc)
mcu.start(); 

// Create a timer to call `tick` every 200 ms
setInterval(tick, 200); 

function tick() {
  mcu.gpio('D6').toggle(); // Toggle GPIO pin D6 to blink LED
}

MetalScript is similar to a typical C toolchain, in that it incorporates a standard runtime library during the compilation process. In C, this provides common functions such as strlen, but in JavaScript this provides the garbage collector and event loop.

No Preprocessor

In C, there are two languages: the preprocessor language, and the C language. The preprocessor language is executed at compile time, while the C language corresponds to statements that are computed at runtime. This is complicated and difficult for a beginner to learn, and it’s even more complicated if you consider linker scripts and make files.

In MetalScript, by contrast, there is only one language — everything is JavaScript. Code that happens before mcu.start() is executed at compile time, while everything after mcu.start() is executed at runtime. The set of operations supported at compile time and runtime are different. For example, the code can require (like #include) other modules at compile time but not runtime. Timers can be set at compile time or runtime, but they will not start ticking until the MCU is started (i.e. until runtime).

The compiler achieves this by executing the JavaScript program as a process in a virtual environment during compilation, and then when the program calls mcu.start(), the state of the process is suspended and it is this suspended process that is actually compiled, not the source code files (the source code becomes part of the process when it is loaded via import or require statements). For more information about this, take a look at my post on using MetalScript in a C project, which provides more technical details.

There are a few advantages of having it done this way. One of them is very fast runtime startup time, since all the heavy initialization work is already done at compile time, which may include building arbitrarily complex structures in memory for the initial state of every module or driver. Another advantage is that user-written JavaScript code has the capacity to control the compilation and startup process, such as influencing flash memory layout, choice of garbage collector, processor clock source, etc. In a small example, this could all happen within the main.js file, and it can have code, variables, and third party libraries shared between all phases of the program8.

A big advantage of this approach is that there is only one language to learn — no need for newcomers to learn macros, C++ templates, linker scripts, and make files. And definitely no need to learn how to write Linux drivers.

Whole Program Optimization

MetalScript compiles the whole program at once, which is one of the reasons why it is initially intended to target firmware and not desktop or server applications, since firmware is necessarily smaller and more self-contained. This gives MetalScript the opportunity to use global optimization techniques, being able to trace values as they flow across the program to do constant propagation and type inference.

This in turn allows library writers to create very versatile libraries that cover a broad range of use cases, without worrying about performance. This is often done in JavaScript by providing a set of options at the point of construction of a module. When the library is compiled into the firmware application, the global optimizer is able to specialize the library for the specific use by propagating constant option values through the library and eliminating unused code.
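To make the idea of specialization concrete, here is a small sketch. The createFormatter library and its option names are hypothetical, and formatSpecialized is written by hand to show the kind of result constant propagation and dead code elimination could produce; it is not actual compiler output.

```javascript
// Hypothetical library: one versatile factory configured by an options object.
function createFormatter(options = {}) {
  const { prefix = '', uppercase = false, maxLength = Infinity } = options;
  return function format(message) {
    let text = uppercase ? message.toUpperCase() : message;
    if (text.length > maxLength) text = text.slice(0, maxLength);
    return prefix + text;
  };
}

// The application only ever uses one configuration, and it is constant.
const format = createFormatter({ prefix: '[app] ' });

// What a whole-program optimizer could, in principle, reduce `format` to
// after propagating the constant options and removing the dead branches.
// Written by hand here; not actual compiler output.
function formatSpecialized(message) {
  return '[app] ' + message;
}
```

The library author pays nothing for offering the uppercase and maxLength options to other users, because in this application those branches can never be taken.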

The performance of MetalScript will be good, both because it is compiled rather than interpreted and because of these whole program optimizations.

Where should it be used?

I think MetalScript will be useful for small-to-medium sized microcontrollers, in the range of 2kB to 500kB of RAM [9]. In larger devices, the size of the program may negatively impact compilation times. The garbage collector and runtime will have some overhead, so devices that are too small may not leave enough space for program operation.

For the initial version of MetalScript, I will be targeting one specific microcontroller (not chosen yet), and then this can be extended in future.

MetalScript will not be suitable for systems with hard-real-time constraints, since the performance of any particular piece of code depends on the rest of the application code (as a result of the whole program optimization), and there is no particular upper bound on how slowly a specific piece of code may run, even though performance is expected to be good on average.

Is it proper JavaScript?

Yes.

We’re talking real, modern ECMAScript, conforming to the specification. By design, you will be able to include pretty much any existing pure-JavaScript NPM package that doesn’t depend on the browser or node APIs. You can use node to execute unit tests if you like, and you can write common libraries or shared code that will run on the browser, the website backend, and the MCU, to capture concepts and models that are common to your IoT domain (if that’s what you’re doing).

The only thing that will knowingly not be supported is eval, since that would require an interpreter running on the device.

TypeScript?

Yes.

As mentioned above, MetalScript will consume real JavaScript source code, and so it will work fine with the TypeScript transpiler. In fact I would strongly encourage you to write your firmware in TypeScript if you can. It will not affect runtime performance, but it will help you eliminate certain types of coding mistakes.

How much will it cost?

In order to continue improving it, working on new generations of the compiler, better library support, and keeping up to date with the latest ECMAScript features, I will need to figure out how to get some money from the project in order to support ongoing development. I haven’t yet figured out the best way to do this. If you have ideas, let me know. Likely the compiler will be free for certain types of use and level of support.

When can I have it?

This is a very big project, and I have been working on it for some years now. I have a proof of concept that demonstrates most of the core features, such as the compile-time execution, suspending the process, and running through a number of compilation steps to get an output ELF file. There is a lot more to do — I am still working on the type inference phase, debugger support, and then need to implement all the JavaScript features.

I am not working against any specific timeline, but since I’m doing this in my spare time, it could take a while still. If you’re in a hurry to have it, contact me (see my contact details on the about page) — I will accept motivational pleas, constructive criticism, monetary donations, or a helping hand. I don’t have a Patreon account, but if you want to support me through something like that then let me know and I’ll set one up.

Probably the best way for you to help me out is to let me know that this is something you want, and share it with your friends on Facebook or Twitter or your favorite forums etc. The biggest impediment to my progress is trying to maintain the motivation to stick with it, day in and day out, and the best way for me to stay motivated will be for me to know that there are people out there who are waiting for it and counting on me to deliver.


  1. Technically the singular of dice is die, but I think “die” is a pretty overloaded word so I’m intentionally using the incorrect word for clarity — after all, the point of writing is to communicate. 

  2. It’s confusing as heck that a company named Microchip made microchips 

  3. I painted my bedroom half sky blue and half terra cotta (although my wife says it was more like orange; take your pick), split down the middle with various patterns at the junction between the two sides. Did you know that when I applied to university, I applied for both art and electrical engineering? 

  4. I have much fonder memories of soldered veroboard circuits than “breadboard” circuits, since the latter tends to get damaged easily 

  5. It helps a lot having component trays at the workbench filled with all the common components you need for this kind of thing 

  6. not “Java” — please do not call JavaScript “Java” for short, since these are two completely different languages 

  7. There is a small amount of implementation defined behavior, but this is a different thing in practice 

  8. It could also be done in multiple files for the purposes of code organization, but I highlight that it can be done in a single file to emphasize that this is not just a case of “linker scripts written in JS” but rather a completely new paradigm. 

  9. I am classifying 500kB of RAM as “medium” because I’m projecting into the future 

Global To-String
JavaScript Corners - Part 9

(This is Part 9 in my series on JavaScript corner cases).

Here’s another one.

In JavaScript, global variables are properties of the global object. By default, the global object is like any other, and inherits from the Object.prototype object. Object.prototype comes with a number of its own properties, such as the toString method. So, that means that toString is also a global variable [1].

console.log('toString' in global); // prints true
console.log(toString === global.toString); // prints true
console.log(global.toString()); // prints [object global]
console.log(toString()); // prints [object Undefined]. Why is this?

Everything seems expected, except the last line, which might seem a little confusing. The toString() call is clearly invoking a function using a reference to that function, where the base of the reference is the global object, right? (Take a look at my posts on references). So surely toString() and global.toString() mean the same thing?

Wrong.

There’s a subtlety here. The unqualified toString reference actually has a base value [2] that is the global environment, which “knows about” the global object, but is not exactly the global object. The base object for the global environment is actually always the value undefined. See here in the spec. This is why it prints “[object Undefined]”.
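The difference can be checked directly. This sketch assumes Node.js running the file as an ordinary (CommonJS) script, and it uses globalThis, the standard name for the global object, in place of global.

```javascript
// The unqualified name and the prototype's method are the same function.
const sameFunction = toString === Object.prototype.toString;

// Qualified call: `this` is the global object.
const qualified = globalThis.toString();

// Unqualified call: `this` is undefined ...
const unqualified = toString();

// ... which is the same as calling the function with an explicit undefined.
const explicit = Object.prototype.toString.call(undefined);
```

So the two calls run the same function and differ only in the this value they pass to it.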

 


  1. To qualify as a global variable, there is actually an additional criterion. The property of the global object must not be listed in the set of unscopables on the global object. In this case, toString is not listed as an unscopable, since it was introduced into JavaScript before the existence of the unscopables feature, and for backwards compatibility it remains that way. 

  2. Recall that a reference has two components: the base (the thing the name is looked up on) and the name of the thing being referred to. For example, obj.x refers to the property named x on the object obj. 

JavaScript Corners – Part 9
Node.js With-Statement Bug

What does the following evil code print?

var x = 'before';
var obj = { x };
with (obj) {
  x = (delete x, 'after');
}
console.log(x);

If you’re not sure, don’t worry — neither are current JavaScript engines. Firefox prints “after”, while Edge, IE, and Node.js print “before” (node v7.9.0). I believe that Firefox is correct in this case.

The tricky statement is obviously the following one, which sets a property on an object in the same statement that deletes the property:

x = (delete x, 'after');

(Side note: if you’re not very familiar with JavaScript, the relevant language features that are being used here are the delete operator, comma operator, and the good ol’ evil with statement).

What we expect to happen

The statement var x introduces a new variable at the script scope [1].

The { x } expression creates a new object with a single property [2] x, where the value of x is copied from the variable x in the outer scope, so it has the initial value of ‘before’.

The with statement brings the properties of the object obj into scope in a new lexical environment.

The statement x = (delete x, 'after') should perform the following steps:

  1. Evaluate the left hand side
  2. Evaluate the right hand side
  3. Assign the value from the right hand result, to the reference created when evaluating the left hand side

When the left hand side is evaluated, the property x will be found in object obj. The base value of the reference is the object, not the script variable scope.

The right hand side evaluates to ‘after’, but in the process it deletes the property x from obj. However, the reference on the left hand side should still refer to “the property named ‘x’ on the object obj”, even though the property with that name is now deleted.

When the assignment happens, it should create a new property named ‘x’ on object obj, with value ‘after’. The variable x in the outer scope should be left unaffected.
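The expected sequence can be written out by hand without a with statement. In this sketch the reference on the left hand side is modelled as a plain { base, name } pair, which is my own simplification of the spec’s notion of a reference, and the helper names are made up for the illustration.

```javascript
var x = 'before';
var obj = { x };

// Step 1: evaluate the left hand side. Inside `with (obj)`, the name x is
// found on obj, so the reference's base is obj (not the outer scope).
const reference = { base: obj, name: 'x' };

// Step 2: evaluate the right hand side, (delete x, 'after').
// The delete removes the property from obj; the comma operator yields 'after'.
delete obj.x;
const value = 'after';
const deletedInBetween = !('x' in obj);

// Step 3: assign through the reference captured in step 1.
// This re-creates the property on obj and leaves the outer variable alone.
reference.base[reference.name] = value;
```

Under these semantics obj.x ends up as 'after' while the outer variable x keeps its original value.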

In this case, I think Node.js gets the wrong answer.


  1. Theoretically, the script scope is the global scope. But in Node.js, scripts are wrapped in a module wrapper that changes the behavior of global vars. This doesn’t affect the outcome of this experiment though 

  2. Bonus fact. Object literals inherit from the global intrinsic object Object.prototype, which has other properties on it, such as toString. So when I say that it has a single property, it would be more accurate to instead say that it has a single own property 

C++ vs JS
Binary Type Coupling

I was chatting to some of my colleagues at work recently about the potential benefits of a JavaScript compiler vs a C compiler, when targeting an embedded MCU. A concern that came up multiple times, regarding different features, was:

…but you can already do that in C, so how is JS better?

This is a good point. If you’re given a problem, you can solve it in JS, or you can solve it in C (or C++). Is JavaScript really any better? Of course, you know my answer, which is probably pretty biased. But let’s dig into a few of the details.

Firstly, let’s put aside the question of performance for a moment. I argue that JS is going to yield a more performant program [1], while I’m sure that most C programmers will argue that C will perform better. Probably both are right, under different circumstances. Let’s just sidestep this issue for the moment.

There is at least one very significant reason why I think JS has a lot more to offer than C or C++, and that’s the ability to write reusable code. This is the gateway feature that opens the door to a 10x improvement in productivity, because it means that you can leverage highly customizable libraries to do most of the heavy lifting, rather than writing the code yourself. Yes, in C and C++ you can use third party libraries. But in JS you can do it an order of magnitude faster. There are many, many times where it’s taken me literally less than 5 minutes to go from “I wonder if there’s a library that already does this”, to using the library and moving on to the next task.

The same cannot be said for any third party C or C++ library or code file that I’ve ever used, which typically takes days or weeks of integration work, reading documentation, figuring out why it doesn’t compile [2], and then debugging cryptic issues.

So why is this the case? Is it the package managers? Is it the community? Is it the language?

I’m guessing it’s multiple reasons, but I’m going to focus on one in particular in this post: binary coupling (at least that’s what I’m calling it).

An Example: a Sum Function

Let’s take a dummy example. Consider a function that adds together a collection of numbers (sound familiar?). In JavaScript, you can be almost certain it will have a signature like this:

function sum(numbers);

We can use TypeScript to make things even more unambiguous, although this doesn’t change the meaning of the code:

function sum(numbers: number[]): number;

This function logically takes a collection of numbers. In JS, collections of this nature are represented as arrays. The output is the sum, which is obviously also a number.

What might this look like in C or C++? Here are some options:

int sum(const int* numbers, size_t count);

int sum(const int* numbers, size_t count) __attribute__((stdcall));

// Null terminated
uint32_t sum(const uint32_t** numbers);

int sum(const std::vector<int>& numbers);

template <int Length>
int sum(int numbers[Length]);

template <int Length>
constexpr int sum(const int (&numbers)[Length]);

// Using an STL-like input iterator (see std::accumulate)
template <typename Iterator>
int sum(Iterator begin, Iterator end);

// Is this even possible? (passing values at compile time)
template<int ...values>
int sum();

Forgive me if I have some or all of these wrong. In my life, I’ve programmed much more in C++ than in JavaScript, but C++ always remains a little bit too complicated for me to remember all the subtleties.

What’s the difference between all of these options? In my mind, they are all logically the same thing. They represent a function to which you pass an “array” of numbers, and it gives you the result.

The difference comes mainly down to how a client interacts with these functions at a binary level. What calling convention is used? How is the array represented in memory? What parameters or aspects of the function are available at compile time vs runtime?

If sum was a third party library, which of these forms would it take?

We have at least one good answer to this question, since the functionality is already part of the STL in C++. It takes the following form:

template< class InputIt, class T >
T accumulate( InputIt first, InputIt last, T init );

There’s nothing surprising about this. Since we’re talking about the standard template library, you expect this to be a function template. It’s a fairly generic solution, but certainly it won’t suit everyone. Because it’s a template, you can’t take a pointer to it (only to a specific instantiation), you can’t pass it around as an argument to non-template code, you can’t export it directly as a library function from an object file, etc. For example, try to translate the following JavaScript code into C++:

function aggregate(useSum, arr) {
  return (useSum ? sum : average)(arr);
}

The useSum and arr values might or might not be available at compile time; a reusable piece of code shouldn’t assume one or the other.
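For reference, here is a runnable version of the JavaScript side. The bodies of sum and average are my own plausible definitions, since the snippet above only names them.

```javascript
function sum(numbers) {
  return numbers.reduce((total, n) => total + n, 0);
}

// `average` is not defined in the article; this is one plausible definition.
function average(numbers) {
  return numbers.length === 0 ? NaN : sum(numbers) / numbers.length;
}

// Same as the snippet above: the function to call is chosen by a runtime value.
function aggregate(useSum, arr) {
  return (useSum ? sum : average)(arr);
}

const total = aggregate(true, [1, 2, 3, 4]);  // 10
const mean = aggregate(false, [1, 2, 3, 4]);  // 2.5
```

Nothing in this code says whether useSum is a compile-time constant or a value that only arrives at runtime, which is exactly the property that is hard to keep when sum is a C++ template.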

How C++ Usually Solves This

The solution to this in C++ is normally to shunt everything to runtime when you need reusability. As a case in point, take a look at the GPIO methods that the mbed library provides. To set a single bit that represents a GPIO pin, you would use a function with the following signature:

// Set the output, specified as 0 or 1 (int) 
void DigitalOut::write(int value);

This makes for fairly reusable code. But it comes at a cost. It invokes a function call [4], using a runtime value. The function then needs to use indirect accesses and masks to figure out what bit to set, internally using a structure that looks like this:

typedef struct {
    PinName  pin;
    uint32_t mask;

    __IO uint32_t *reg_dir;
    __IO uint32_t *reg_set;
    __IO uint32_t *reg_clr;
    __I  uint32_t *reg_in;
} gpio_t;

The absolute cost here is low: a function call is not a big deal, masking and indirection are quick, and passing an integer argument at runtime is not a big deal. But the relative cost is high. If value and the specific port happened to be known at compile time, this whole thing could have been one machine instruction. Now it has to be many more, with extra memory overhead.

But surely in JavaScript this is even worse? Isn’t everything at runtime in JavaScript?

I would argue “no”. In JavaScript you don’t specify what gets done at compile time vs runtime — you leave it up to the JavaScript engine to decide. This is why I use the term “binary coupling”. In C++, your code is coupled to the binary representation of the resulting machine code — you are forced to make certain decisions, and those decisions force the C++ compiler to use certain representations. This makes C++ a glorified assembly language — it’s a way for you to write machine code in a somewhat abstracted way.

While in JavaScript, you don’t write machine code. You specify the behavior of the program in unambiguous terms, and leave it up to the engine or compiler to implement the specification in terms of machine code.
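Here is a sketch of what the same GPIO idea might look like in JavaScript. The createPin function and the plain-object “registers” are hypothetical and only simulate a port under Node.js; this is not a MetalScript or mbed API. The final lines show, by hand, the single store that the call could reduce to when everything is known ahead of time.

```javascript
// Simulated port registers (a real MCU would expose memory-mapped registers).
const portA = { set: 0, clr: 0 };

// Hypothetical reusable pin abstraction: the code states *what* happens,
// and says nothing about calling conventions or memory layout.
function createPin(port, bit) {
  const mask = 1 << bit;
  return {
    write(value) {
      if (value) port.set = mask;
      else port.clr = mask;
    },
  };
}

const led = createPin(portA, 5);
led.write(1);

// With port, bit and value all known ahead of time, a whole-program compiler
// is free to reduce the call above to the equivalent of a single store:
const portB = { set: 0, clr: 0 };
portB.set = 0x20; // 1 << 5
```

The same createPin code would still work unchanged if the bit number or the value only became known at runtime; the source does not commit to either.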

You get a small taste of this in C++ with micro-optimizations and inlining. You write a function, and in some circumstances the compiler is free to choose not to represent that as an actual CALL instruction in the output binary. This is an example that breaks the usual binary coupling, since the code does not dictate the binary interface. But this kind of behavior is necessarily very limited in C++.

The discussion here is not primarily about performance, it’s about reusability. What I’m demonstrating is that in C++, it’s impossible to have a library API that doesn’t also dictate the ABI that is used to access that library. As a result you’re much less likely to find an existing library that serves your specific purposes, since you need to match not only the logical interface, but the binary interface as well (and the type interface, which is another story).


  1. Not referring to interpreted or JIT-compiled JavaScript of today, but a hypothetical bare-metal compiler 

  2. Did you leave out a macro configuration flag somewhere? did you use the right make file for your platform? are the include files out of date? were you supposed to use the .dll files, or the .o files, or the .a files? the instructions are for GCC, how do I do this in VC++? the instructions are for version 1.34 but only version 1.32b is ported to my system, and for some reason the make file doesn’t run 

  3. stdcall 

  4. The class itself is a small inline wrapper around the HAL, but the HAL is implemented as separate object files 

JavaScript Corners – Part 8
References (Continued)

Given an object o with a member function f that prints out what the this value is:

const o = {
  f() {
    console.log(
      this === global ? 'global' :
      this === undefined ? 'undefined':
      this === o ? 'o':
      '-');
  }
}

We know what the following prints:

o.f();  // prints "o"

And we know what the following prints [1]:

const f = o.f;
f(); // prints "global"

I always thought that the difference came down to the fact that o.f() is actually invoking a different operator, something like a “member call operator”.

However, what do you think the following prints?

(o.f)();

My guess, up until today, would have been that this prints “global”, since with the parentheses, this is no longer invoking the member call operator, but is instead invoking the call operator.

But I was wrong. There is no such thing as a “member call operator”. Rather, the “call” operator just behaves differently depending on whether the target of the call is a value or a reference [2].

So this actually prints “o”.

(o.f)(); // prints "o"

But hang on. Why didn’t the parentheses coerce o.f to a value?

One might have expected the parentheses to automatically dereference o.f, something like the following examples that use the logical OR and comma operators to coerce the target to a value instead of a reference:

(o.f || 0)(); // prints "global"
(0, o.f)(); // prints "global"

Indeed, this could have been the case for bare parentheses as well, but the language designers chose not to do it that way, so that the delete and typeof operators still work when extraneous parentheses are provided:

delete o.f; // The "correct" way to delete a property
delete (o.f); // This also works
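The cases above can be collected into one script that returns strings so the results can be compared. The names results, probe and someNameThatIsNeverDeclared are made up for the illustration; the checks only distinguish “this is o” from “this is not o”, so they hold in both strict and non-strict code.

```javascript
const o = {
  f() {
    return this === o ? 'o' : 'not o';
  },
};

const results = {
  member: o.f(),            // reference with base o
  parenthesized: (o.f)(),   // parentheses preserve the reference
  logicalOr: (o.f || 0)(),  // || yields a value, so the base is lost
  comma: (0, o.f)(),        // the comma operator also yields a value
};

// typeof on a parenthesized, undeclared name still does not throw,
// because the parentheses do not force the reference to be read first.
const typeOfUndeclared = typeof (someNameThatIsNeverDeclared);

// delete through parentheses still removes the property.
const probe = { p: 1 };
const deleted = delete (probe.p);
```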

 


  1. assuming the use strict directive isn’t provided in this case 

  2. To be more accurate, the target also behaves differently depending on whether the target reference refers to a property of an object vs a variable in an environment record 

JavaScript Corners – Part 7
Calls and With Statements

Here’s a quick one. What does the following print? (Assuming not in strict mode)

function foo() {
  console.log(this.name);
}

const bar = { foo, name: 'Bar' };
global.name = 'Global';

foo();         // Case 1
bar.foo();     // Case 2
with (bar) {
  foo();       // Case 3
}

In non-strict mode, the naked function call foo() gets a this value that is the global object. So the first case prints “Global”.

In the second case, we’re invoking foo as a member of bar, and so the this value is bar (it prints “Bar”).

The last case is the most interesting, and the most useless (since with statements are strongly discouraged, and cannot be used in strict mode). The this object in this case is actually bar. JavaScript recognizes that the function foo here is being invoked within the context of a with statement, and implicitly uses the bar object. This prints “Bar”.
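A way to check the third case without depending on how the surrounding file is loaded is to build the with statement through the Function constructor, which produces non-strict code. The helper name callInsideWith is made up for this sketch, and the functions return the name instead of logging it so the three call styles can be compared.

```javascript
function foo() {
  return this.name;
}

const bar = { foo, name: 'Bar' };

// The Function constructor always produces non-strict code (unless the body
// itself says 'use strict'), so the with statement is allowed here.
const callInsideWith = new Function('bar', 'with (bar) { return foo(); }');

const viaWith = callInsideWith(bar);  // foo resolved through the with object
const viaMember = bar.foo();          // ordinary member call
const viaCall = foo.call(bar);        // explicit this
```

All three calls end up with bar as the this value, so the call inside the with block behaves like the member call rather than like the naked call in Case 1.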