Snapshotting is like compiling but better

TL;DR: The final output of a traditional compiler like GCC bears a family resemblance to a Microvium snapshot, but the snapshotting paradigm is both easier to use and more powerful because it allows real application code to run at build time and its state to persist until runtime.

What is snapshotting?

My Microvium JavaScript engine is built on the paradigm of creating a VM snapshot as the deployable build artifact rather than creating a traditional compiled binary. As a developer, when you run the Microvium engine on your desktop with a command like microvium main.js it will execute the script until all the top-level code is complete and then output a snapshot file containing the final VM state. The snapshot file can then be “resumed” on an embedded device using Microvium’s embedded C library (for more details, see Getting Started). Although Microvium is designed especially for microcontrollers, the principle of snapshotting goes beyond the embedded space.

Comparing to GCC

For this post, I’ll mostly compare Microvium to GCC:

gcc main.c         # Compile a C program with GCC
microvium main.js  # "Compile" a Microvium program

These two commands are analogous. Both produce a single file as the result, and this file is what you want to deploy to the target environment. In the case of GCC, the output is of course the executable (e.g. a.out), while in the case of Microvium, it’s the snapshot (main.mvm-bc).

Both of these commands do some kind of compilation as part of the process. GCC translates your function code to machine instructions, while Microvium translates to virtual machine instructions (bytecode instructions).

Constants

Both the GCC output and the Microvium output have a section for constants, including function code. You may be familiar with this as the .text section. Among other things, this contains constant values, such as:

// JavaScript
const x = 42;
// C
const int x = 42;

… but better

You can do this in Microvium but not in C:

const x = foo();

function foo() {
  return 42;
}

In C, it’s a compile-time error to call runtime functions for the calculation of constants. But in Microvium, this is perfectly legal since there is no distinction between compile-time and runtime: there is only really runtime before the snapshot and runtime after the snapshot, although informally we may refer to the former as “compile-time” and the latter as “runtime”.

Apart from just being cleaner and easier to use, the Microvium snapshotting paradigm here allows computationally-intensive constants to be calculated at build time, using arbitrary functions and libraries that might also be useful at runtime.
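As a rough illustration of a computationally intensive constant, the sketch below builds a CRC-8 lookup table in top-level code. Under the snapshotting model described above, the loop would run once on the desktop and only the finished table would be carried to the device. The names buildCrc8Table, crc8Table, and crc8, and the choice of polynomial 0x07, are assumptions made for this example and are not part of Microvium.

```javascript
// Hypothetical example: build a CRC-8 lookup table (polynomial 0x07).
function buildCrc8Table() {
  const table = [];
  for (let i = 0; i < 256; i++) {
    let crc = i;
    for (let bit = 0; bit < 8; bit++) {
      // Shift left; if the top bit was set, fold in the polynomial.
      crc = (crc & 0x80) ? ((crc << 1) ^ 0x07) & 0xff : (crc << 1) & 0xff;
    }
    table[i] = crc;
  }
  return table;
}

// Top-level code: in the snapshotting model this runs before the snapshot.
const crc8Table = buildCrc8Table();

// Intended for use after the snapshot is resumed: one table lookup per byte.
function crc8(bytes) {
  let crc = 0;
  for (let i = 0; i < bytes.length; i++) {
    crc = crc8Table[crc ^ bytes[i]];
  }
  return crc;
}
```

The same buildCrc8Table function stays available at runtime, which is the point made above about build-time code using functions that are also useful later.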

Variable initializers… but better

For non-constant variables, both GCC and Microvium have an output section for the initial [1] value of all the variables, which is copied into RAM at runtime. You may know this traditionally as the .data section.

But a major difference between them is that a Microvium snapshot also contains heap data.

// JavaScript
let arr = [1, 2, 3];

// C
int* arr = malloc(3 * sizeof(int));  // ! Can't do this (in top-level code)

The best you could do in C for the above example would be to have an init function that runs early in the program to set up the initial runtime state of the program. Snapshotting is better because this initial structure can be established at build time.
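To make the contrast a little more concrete, here is a small sketch of top-level code that builds a nested heap structure. In the snapshotting model, the registerCommand calls would run at build time and the populated table would be part of the snapshot, so no init function is needed on the device. The names commands, registerCommand, and handleCommand are hypothetical.

```javascript
// Hypothetical command table, built by ordinary top-level code.
const commands = {};

function registerCommand(name, handler) {
  // Each entry is a heap-allocated object holding a closure and a counter.
  commands[name] = { handler: handler, callCount: 0 };
}

registerCommand('led-on', () => 'LED on');
registerCommand('led-off', () => 'LED off');

// Intended to be called after the snapshot is resumed.
function handleCommand(name) {
  const entry = commands[name];
  if (!entry) return 'unknown command';
  entry.callCount++;
  return entry.handler();
}
```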

Modules … but better

Microvium and C both have support for structuring your code in multiple files which get bundled into the same output artifact:

import { foo } from './foo.js'
#include "foo.h"

With the #include here, the dependent module implementation (e.g. foo.c) is not automatically compiled and linked into the program by GCC — you need to separately list foo.c to be compiled by GCC, or orchestrate the dependencies using a makefile.

But in the case of a Microvium import, the import statement itself is executed at build time, performing the module resolution, loading, parsing, and linking at build time, as well as executing the top-level code of the imported module. The top-level code of the imported module may in turn import other modules, transitively importing the whole module graph and executing its top-level initialization code.

Preprocessor… but better

Both C and Microvium support “compile-time” logic:

// C
#if USE_FOO_1
#define foo foo1
#else
#define foo foo2
#endif
// JavaScript
const foo = USE_FOO_1 ? foo1 : foo2;

Of course, you already knew that, because all the examples so far have demonstrated the fact that snapshotting allows you to run JavaScript at compile time. But I want to emphasize some of the key reasons why the snapshotting paradigm is better for this:

  • You don’t need two different programming languages (e.g. the preprocessor language and the C language).
  • Your “compile-time” code has the full power of the main language.
  • Your runtime code carries over the state from your compile-time code.

So in some sense, this unifies the preprocessor language with the main language. This applies similarly to other “compile-time languages”, such as makefiles and linker scripts. Or if you’re coming from the JS world, consider how the snapshotting paradigm obviates the need for a webpack.config (see Snapshotting vs Bundling).

But what about using the preprocessor to conditionally include different runtime logic, such as for different devices? For example, consider the following C code:

void myFunction(void) {
#if SOME_CONDITION
  doX();
#else
  doY();
#endif
}

Of course, it’s easy to see how this example might translate to JS:

function myFunction(someCondition) {
  if (someCondition) {
    doX();
  } else {
    doY();
  }
}

Now we have the bonus that our unit tests can inject someCondition to test both cases.
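A minimal sketch of such a test might look like the following. The recording stand-ins for doX and doY and the testBothBranches helper are assumptions made for illustration; a real project would use whatever test framework it already has.

```javascript
// Stand-ins for doX and doY that record which branch ran (hypothetical).
const calls = [];
function doX() { calls.push('X'); }
function doY() { calls.push('Y'); }

function myFunction(someCondition) {
  if (someCondition) {
    doX();
  } else {
    doY();
  }
}

// Exercise both branches by injecting the condition as an argument.
function testBothBranches() {
  calls.length = 0;
  myFunction(true);
  myFunction(false);
  return calls.join(',') === 'X,Y';
}
```

With the preprocessor version, covering both branches would need two separate builds of the test binary.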

But doesn’t this code mean that now we have both doX and doY branches at runtime, taking up ROM space? (and someCondition)

That’s why I’ve also been developing the experimental Microvium Boost: an optimizer that analyzes a snapshot and removes unused branches of code. For example, if the analysis shows that someCondition is always true in your program, it can remove it as a parameter from myFunction and also remove the call to doY as dead code. This is still experimental but has shown significant success so far.

Host exports… but better

So far we’ve been considering an executable output from GCC, but it would be more accurate to compare a Microvium snapshot with a compiled shared library (e.g. DLL). Like a shared library, Microvium snapshots do not have a single entry point but may contain many exported functions to be resolved at runtime on the device.

A Windows DLL suits the analogy better than a Linux shared library, so in this section the examples will use msbuild rather than GCC.

Both a Microvium snapshot and a DLL binary contain a section for dynamic linking information: a table that associates relevant functions in the DLL with a number [2] so that the host program using the DLL/snapshot at runtime can find them.

In the case of a DLL, you can provide the compiler with a DEF file that tells the compiler what to put into the DLL export table. If you wanted to export the functions foo, bar, and baz from the DLL with IDs 1, 2, and 3 respectively, your DEF file might look like this:

LIBRARY   MyLibrary
EXPORTS
   foo   @1
   bar   @2
   baz   @3
// C
int foo() { return 42; }
int bar() { return 43; }
int baz() { return 44; }

The equivalent in Microvium would be as follows:

// JavaScript
function foo() { return 42; }
function bar() { return 43; }
function baz() { return 44; }

vmExport(1, foo);
vmExport(2, bar);
vmExport(3, baz);

You may have noticed a recurring theme in this post: the Microvium snapshotting paradigm doesn’t require a whole new language in order to do different build-time tasks. In this case, Microvium doesn’t require a DEF file (or a special __declspec(dllexport) language extension), since vmExport is just a normal function. This is just simpler and more natural.

Another recurring theme here is that the snapshotting approach is more powerful, allowing you to do things that are impossible or impractical in the traditional paradigm. Take a look at the following example in Microvium:

// JavaScript
for (let i = 1; i <= 3; i++) {
  vmExport(i, () => 41 + i);
}

This has the same overall effect as the previous code, adding 3 functions to the export table of the deployed binary with IDs 1, 2, and 3 and which return 42, 43, and 44 respectively. But having vmExport be a normal function means that now we have the full power of the language for orchestrating these exports, or for writing an abstraction layer over the export system, or outsourcing this logic to a third-party library.
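As one possible shape for such an abstraction layer, the sketch below exports a list of functions under consecutive IDs. The helper exportAll is hypothetical. It takes the export function as a parameter so that the example also runs outside Microvium with a fake exporter; in a Microvium script one would presumably pass vmExport itself.

```javascript
// Hypothetical helper: export a list of functions under consecutive IDs.
// `exporter` is expected to behave like vmExport(id, fn).
function exportAll(exporter, firstId, functions) {
  for (let i = 0; i < functions.length; i++) {
    exporter(firstId + i, functions[i]);
  }
  return firstId + functions.length; // the next free ID
}

function foo() { return 42; }
function bar() { return 43; }
function baz() { return 44; }

// Fake exporter that records the table, for running outside Microvium.
const fakeTable = {};
function fakeExport(id, fn) { fakeTable[id] = fn; }

const nextId = exportAll(fakeExport, 1, [foo, bar, baz]);
```

Passing the exporter in also means the export logic can be unit tested on the desktop without a device.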

Side note: a more subtle point in this example for advanced readers is its code cohesion. The single line of code mixes both compile-time and runtime code (vmExport(i,...) and ()=> 41 + i respectively), but keeps the related parts of both in close proximity. This is the difference between temporal cohesion (grouping code by when it’s run) vs functional cohesion (grouping code by what feature it relates to) (see Wikipedia). A common disadvantage of having separate build-time or deploy-time code (e.g. a DEF file, makefile, linker script, webpack.config, dockerfile, terraform file, etc) is that it pushes you into temporal cohesion, which in turn damages modularity and reusability.

Conclusion

The idea of deploying a snapshot rather than a traditional compiled binary opens up a whole new paradigm for software development. The end result is very similar — a binary image with sections for different memory spaces, compiled function code, constants, initial variable values, and export/import tables — but the snapshotting paradigm is both simpler and more powerful.


  1. In the context of Microvium, the word “initial” here refers to the initial state when the snapshot is resumed, not the initial state when the program is started, since the program starts at build time, and variables in JavaScript start with the value undefined. 

  2. DLL exports can be by name or number, but Microvium exports are only by number, for efficiency reasons, so that’s what I’m using in the analogy here. 

Microvium is very small

TL;DR: The Microvium JavaScript engine for microcontrollers takes less than 16 kB of ROM and 64 bytes of RAM per VM while idle, making it possibly the smallest JavaScript engine to date with more language features than engines 4x its size.


I’ve designed Microvium from the ground up with the intention for it to be tiny, and it’s been an absolute success in that sense. Microvium may be the smallest JavaScript engine out there, but it still packs a punch in terms of features.

*Edit: this post was originally written when Microvium was around 8.2 kB of ROM. Since then, new features have been added. As of August 2023, Microvium is now 12 kB.

Does size matter?

Size often matters in small MCU devices. A large proportion of microcontroller models available on the market still have less than 64 kB of flash and less than 2 kB of RAM. These are still used because they’re smaller, cheaper, and draw less power than their larger counterparts. All the microcontrollers I’ve worked with in my career as a firmware engineer have had ≤ 16 kB RAM.

Some might say that you shouldn’t even want JavaScript on such small devices, and certainly in some cases that would be true. But as I pointed out in my last post, juggling multiple operations in firmware can be both easier and more memory efficient if the high-level logic is described in terms of a language like JavaScript, even if that’s the only thing you’re using it for.

Even on larger devices, do you really want to dedicate a large chunk of it to a JavaScript engine? A smaller engine is a smaller commitment to make — a lower barrier to entry.

How does it compare?

If I Google “smallest JavaScript engine for microcontrollers”, the first one on the list is Elk. Elk is indeed pretty tiny. For me, it compiles to just 11.5 kB of flash [1]. Microvium compiled with the same settings compiles to about 12 kB, in the same ballpark.

What about RAM?

The amount of RAM Elk uses is not pre-defined — you give it a buffer of RAM of any size you want, but it needs to be at least 96 bytes for the VM kernel state. Microvium takes 36 bytes for the kernel state.

But the massive difference in memory requirements is that Elk needs all of its memory allocated upfront and keeps it for the lifetime of the VM. If your script’s peak memory in Elk is 1 kB then you need to give it a 1 kB buffer at startup, so its idle memory usage is 1 kB. Microvium on the other hand uses malloc and free to allocate when needed and free when not needed. Its idle memory usage can be as low as 88 bytes. In typical firmware, idle memory is much more important than peak memory, as I explained in my last post.

What about the feature set? This is another area where Microvium and Elk diverge significantly. The following table shows the differences:

Feature                                           Microvium   Elk
var, const (Elk supports let only)                yes         no
do, switch, for                                   yes         no
Computed member access a[b]                       yes         no
Arrow functions, closures                         yes         no
try/catch                                         yes         no
async/await                                       yes         no
Modules                                           yes         no
Snapshotting                                      yes         no
Uses intermediate bytecode (better performance)   yes         no
Parser at runtime                                 no          yes
ROM                                               12 kB       11.5 kB
Idle RAM                                          88 B        Lots
Peak kernel RAM                                   36 B        96 B
Slot size (size of simple variables)              2 B         8 B

The only thing that Elk can do that Microvium can’t do is execute strings of JavaScript text at runtime. So if your use case involves having human users directly provide scripts to the device, without any intermediate tools that could pre-process the script, then you can’t use Microvium and you might want to use Elk, mJS, or a larger engine like XS. On the other hand, if your use case has at any point a place where you can preprocess scripts before downloading them to the device then you can use Microvium.

Comparing with mJS

But Cesanta, the maker of Elk, also made a larger JS engine with more features: mJS, which is probably the closest match to Microvium in terms of feature set. mJS lets you write for-loops and switch statements for example.

Since they’re closely matched for intent and features, I did a more detailed comparison of mJS and Microvium here. But here’s a summary:

Feature                                           Microvium   mJS        Elk
var, const (mJS supports let only)                yes         no         no
Template strings                                  yes         no         no
Arrow functions and closures                      yes         no         no
try/catch                                         yes         no         no
async/await                                       yes         no         no
ES Modules                                        yes         no (*)     no
do, switch, for                                   yes         yes        no
Computed member access a[b]                       yes         yes        no
Uses intermediate bytecode (better performance)   yes         yes        no
Some builtin-functions                            no          yes        no
Parser at runtime                                 no          yes        yes
ROM                                               12 kB       45.6 kB    11.5 kB
Slot size                                         2 B         8 B        8 B

(*) mJS does support a non-standard load function.

I’ve lumped “some builtin-functions” into one box because it’s not a language feature as such. mJS has a number of builtin functions that Microvium doesn’t have, most notably print, ffi, s2o, JSON.stringify, JSON.parse and Object.create. You can implement these yourself in Microvium quite easily without modifying the engine (or find implementations online), and it gives you the option of choosing what you want rather than having all that space forced on you [2].

In terms of features, mJS is a more “realistic” JavaScript engine, compared to Elk’s minimalistic approach. I wouldn’t want to write any substantial real-world JavaScript without a for-loop for example. Like Microvium, mJS also precompiles the scripts to bytecode and then executes the bytecode, which results in much better performance than trying to parse on the fly. Engines like Elk that parse as they execute also have the unexpected characteristic that comments and whitespace slow them down at runtime.

But the added features in mJS means it costs a lot more in terms of ROM space — about 4x more than Elk and Microvium.

Microvium still has more core language features than mJS, making it arguably a more pleasant language to work in. These features are actually quite useful in certain scenarios:

  • Proper ES module support is important for code organization and means that your Microvium modules can also be imported into a node.js or browser environment. You can have the same algorithms shared by your edge devices (microcontrollers), backend servers, and web interfaces, to give your users a unified experience.
  • Closures are fundamental to callback-style asynchronous code, as I explained in my previous post.

Conclusion

I’m obviously somewhat biased since Microvium is my own creation, but the overall picture I get is this:

  • Microvium is the smallest JavaScript engine that I’m aware of [3].
  • In this tiny size, Microvium actually supports more core language features than engines more than 4x its size. Some of these features are really useful for writing real-world JS apps.
  • Having said that, Microvium has fewer built-in functions. It’s more of a pay-as-you-go philosophy where your upfront commitment is much less and you bring in support for what you need when you need it.
  • The big trade-off is that Microvium doesn’t have a parser at runtime. In the rare case that you really need a parser at runtime, Microvium simply won’t work for you.

Something that made me smile is this note by one of the authors of mJS in a blog post:

That makes mJS fit into less than 50k of flash space (!) and less than 1k of RAM (!!). That is hard to beat.

https://mongoose-os.com/blog/mjs-a-new-approach-to-embedded-scripting/

I have great respect for the authors of mJS and what they’ve done, which makes me all the more proud that Microvium is able to knock this out of the ballpark, beating what the seasoned professionals have called “hard to beat”. Of course, this comes with some tradeoffs (no parser and no builtin functions), but I’ve achieved my objective of making a JavaScript engine that has a super-low upfront commitment and will squeeze into the tiniest of free spaces, all while still including most of the language features I consider to be important for real-world JavaScript apps.


  1. All of the sizes quoted in this post are when targeting the 32-bit ARM Cortex M0 using GCC with optimization for size. I’m measuring these sizes in June 2022, and of course they may change over time. 

  2. The ffi in mJS is something that would need to be a built-in in most engines but Microvium’s unique snapshotting approach makes it possible to implement the ffi as a library just like any of the other functions 

  3. Please let me know if you know of a smaller JS engine than Microvium. 

Single-threading is more memory-efficient

TL;DR: Single-threading with super-loops or job queues may make more efficient use of a microcontroller’s memory over time, and Microvium’s closures make single-threading easier with callback-style async code.

Multi-threading

In my last post, I proposed the idea that we should think of the memory on a microcontroller not just as a space but as a space-time, with each memory allocation occupying a certain space for some duration of time. I suggested therefore that we should then measure the cost of an allocation in byte-seconds (the area of the above rectangles as bytes × seconds), so long as we assumed that allocations were each small and occurred randomly over time. Randomness like this is a natural byproduct of a multi-threaded environment, where at any moment you may coincidentally have multiple tasks doing work simultaneously and each taking up memory. In this kind of situation, tasks must be careful to use as little memory as possible because at any moment some other tasks may fire up and want to share the memory space for their own work.

The following diagram was generated by simulating a real malloc algorithm with random allocations over time (code here, and there’s a diagram with more allocations here):

A key thing to note is that the peak memory in this hypothetical chaotic environment can be quite a bit higher than the average, and that these peaks are not easily predictable and repeatable because they correspond to the coincidental execution of multiple tasks competing for the same memory space [1].

This leaves a random chance that at any moment you could run out of memory if too many things happen at once. You can guard against this risk by just leaving large margins — lots of free memory — but this is not a very efficient use of the space. There is a better way: single threading.

Single threading

Firmware is sometimes structured in a so-called super-loop design, where the main function has a single while(1) loop that services all the tasks in turn (e.g. calling a function corresponding to each task in the firmware). This structure can have a significant advantage for memory efficiency. In this way of doing things, each task essentially has access to all the free memory while it has its turn, as long as it cleans up before the next task, as depicted in the following diagram. (And there may still be some statically-allocated memory and “long-lived” memory that is dynamic but used beyond the “turn” of a task).

Overall, this is a much more organized use of memory and potentially more space-efficient.

In a multi-threaded architecture, if two memory-heavy tasks require memory around the same time, neither has to wait for the other to be finished — or to put it another way, malloc looks for available space but not available time for that space. On the other hand, in a super-loop architecture, those same tasks will each get a turn at different times. Each will have much more memory available to them during their turn while having much less impact on other tasks the rest of the time. And the overall memory profile is a bit more predictable and repeatable.

An animated diagram you may have seen before on my blog demonstrates the general philosophy here. A task remains idle until its turn, at which point it takes center stage and can use all the resources it likes, as long as it packs up and cleans up before the next task.

So, what counts as expensive in this new memory model?

It’s quite clear from the earlier diagram:

  • Memory that is only used for one turn is clearly very cheap. Tasks won’t be interrupted during their turn, so they have full access to all the free memory without impacting the rest of the system.
  • Statically-allocated memory is clearly the most expensive: it takes away from the available memory for all other tasks across all turns.
  • Long-lived dynamic allocations — or just allocations that live beyond a single turn — are back to the stochastic model we had with multi-threading. Their cost is the amount of space × the number of turns they occupy the space for. Because these are a bit unpredictable, they also have an additional cost because they add to the overall risk of randomly running out of memory, so these kinds of allocations should be kept as small and short as possible.

Microvium is designed this way

Microvium is built from the ground up on this philosophy — keeping the idle memory usage as small as possible so that other operations get a turn to use that memory afterward, but not worrying as much about short spikes in memory that last only a single turn.

  • The idle memory of a Microvium virtual machine is as low as 34 bytes [2].
  • Microvium uses a compacting garbage collector (one that consolidates and defragments all the living allocations into a single contiguous block) and releases any unused space back to the host firmware. The GC itself uses quite a bit of memory [3] but it does so only for a very short time and only synchronously.
  • The virtual call-stack and registers are deallocated when control returns from the VM back to the host firmware.
  • Arrays grow their capacity geometrically (they double in size each time) but a GC cycle truncates unused space in arrays when it compacts.

See here for some more details.

Better than super-loop: a Job Queue

The trouble with a super-loop architecture is that it services every single task in each cycle. It’s inefficient and doesn’t scale well as the number of tasks grows [4]. There’s a better approach, one that JavaScript programmers will be well familiar with: the job queue.

A job queue architecture in firmware is still pretty simple. Your main loop is just something like this:

while (1) {
  if (thereIsAJobInTheQueue) 
    doNextJob();
  else
    goToSleep();
}

When I write bare-metal firmware, often the first thing I do is to bring in a simple job queue like this. If you’re using an RTOS, you might implement it using RTOS queues, but I’ve personally found that the job-queue style of architecture often obviates the need for an RTOS at all.
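The same idea can be sketched in JavaScript as a plain FIFO of functions. This is only an illustration of the pattern, not how any particular firmware or Microvium host implements it; the names jobQueue, enqueueJob, runNextJob, and runUntilIdle are made up for the example.

```javascript
// Minimal FIFO job queue (hypothetical names).
const jobQueue = [];

function enqueueJob(job) {
  jobQueue.push(job);
}

// Runs one job if available. Returns false when the queue is empty,
// which is the point where firmware would go to sleep.
function runNextJob() {
  if (jobQueue.length === 0) return false;
  const job = jobQueue.shift();
  job();
  return true;
}

// Drain the queue, counting the turns taken.
function runUntilIdle() {
  let turns = 0;
  while (runNextJob()) turns++;
  return turns;
}

// Example: a job may enqueue follow-up work, which runs in a later turn.
const log = [];
enqueueJob(() => { log.push('a'); enqueueJob(() => log.push('c')); });
enqueueJob(() => log.push('b'));
const turns = runUntilIdle();
```

Note that the follow-up job enqueued by the first job runs after the job that was already waiting, so each job gets its turn in arrival order.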

As JavaScript programmers may also be familiar with, working in a cooperative single-threaded environment has other benefits. You don’t need to think about locking, mutexes, race conditions, and deadlocks. There is less unpredictable behavior and fewer heisenbugs. In a microcontroller environment especially, a single-threaded design also means you save on the cost of having multiple dedicated call stacks being permanently allocated for different RTOS threads.

Advice for using job queues

JavaScript programmers have been working with a single-threaded job-queue-based environment for decades and are well familiar with the need to keep the jobs short. When running JS in a browser, long jobs mean that the page becomes unresponsive, and the same is true in firmware: long jobs make the firmware unresponsive, unable to respond to I/O or service accumulated buffers, etc. In a firmware scenario, you may want to keep all jobs below 1ms or 10ms, depending on what kind of responsiveness you need [5].

As a rule of thumb, to keep jobs short, they should almost never block or wait for I/O. For example, if a task needs to power-on an external modem chip, it should not block while the modem boots up. It should probably schedule another job to handle the powered-on event later, allowing other jobs to run in the meantime.

But in a single-threaded environment, how do we implement long-running tasks without blocking the main thread? Do you need to create complicated state machines? JavaScript programmers will again recognize a solution…

Callback-based async

JavaScript programmers will be quite familiar with the pattern of using continuation-passing-style (CPS) to implement long-running operations in a non-blocking way. The essence of CPS is that a long-running operation should accept a callback argument to be called when the operation completes.

The recent addition of closures (nested functions) as a feature in Microvium makes this so much easier. Here is a toy example one might use for sending data to a server in a multi-step process that continues across 3 separate turns of the job queue:

function sendToServer(url, data) {
  modem.powerOn(powerOnCallback);

  function powerOnCallback() {
    modem.connectTo(url, connectedCallback);
  }

  function connectedCallback() {
    modem.send(data);
  } 
}

Here, the data parameter is in scope for the inner connectedCallback function (closure) to access, and the garbage collector will automatically free both the closure and the data when they aren’t needed anymore. A closure like this is much more memory-efficient than having to allocate a whole RTOS thread, and much less complicated than manually fiddling with state machines and memory ownership yourself.

Microvium also supports arrow functions, so you could write this same example more succinctly like this:

function sendToServer(url, data) {
  modem.powerOn( () => 
    modem.connectTo(url, () => 
      modem.send(data)));
}

Each of these 3 stages (powerOn, connectTo and send) happens in a separate job in the queue. Between each job, the VM is idle: it does not consume any stack space [6] and the heap is in a compacted state [7].

If you’re interested in more detail about the mechanics of how modem.powerOn etc. might be implemented in a non-blocking way, take a look at this gist where I go through this example in more detail.
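For readers who just want the general shape, here is a rough sketch with a fake modem. It is not the code from the gist: the modem object, pendingJobs, trace, and runJobs are all assumptions made for illustration. The fake modem queues each callback immediately, whereas real firmware would presumably queue it when the corresponding hardware event arrives.

```javascript
// Hypothetical sketch: a fake modem whose operations complete in a later job.
const pendingJobs = [];
const trace = [];

const modem = {
  powerOn: (callback) => {
    trace.push('powering on');
    pendingJobs.push(callback);   // stands in for the "powered on" event
  },
  connectTo: (url, callback) => {
    trace.push('connecting to ' + url);
    pendingJobs.push(callback);   // stands in for the "connected" event
  },
  send: (data) => {
    trace.push('sending ' + data);
  },
};

function sendToServer(url, data) {
  modem.powerOn(() =>
    modem.connectTo(url, () =>
      modem.send(data)));
}

// Each loop iteration is one turn of the job queue.
function runJobs() {
  let turns = 0;
  while (pendingJobs.length > 0) {
    const job = pendingJobs.shift();
    job();
    turns++;
  }
  return turns;
}

// The initial call is itself a job, so the whole operation spans 3 turns.
pendingJobs.push(() => sendToServer('example.com', 'hello'));
const turnsTaken = runJobs();
```

Between turns, the only state kept alive for this operation is the pending closure and the url and data values it captures.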

Conclusion

So, we’ve seen that multi-threading can be a little hazardous when it comes to dynamic memory management because memory usage is unpredictable, and this also leads to inefficiencies because you need to leave a wider margin of error to avoid randomly running out of memory.

We’ve also seen how single-threading can help to alleviate this problem by allowing each operation to consume resources while it has control, as long as it cleans up before the next operation. The super-loop architecture is a simple way to achieve this but an event-driven job-queue architecture is more modular and efficient.

And lastly, we saw that the Microvium JavaScript engine for embedded devices is well suited to this kind of design, because its idle memory usage is particularly small and because it facilitates callback-style asynchronous programming. Writing code this way avoids the hassle and complexity of writing state machines in C, of manually keeping track of memory ownership across those states, and the pitfalls and overheads of multithreading.


  1. This simulation with random allocations is not a completely fair representation of how most firmware allocates memory during typical operation, but it shows the consequence of having many memory-consuming operations that can be preempted unpredictably or outside the control of the firmware itself. 

  2. Or 22 bytes on a 16-bit platform 

  3. In the worst case, it doubles the size of heap while it’s collecting 

  4. A super-loop also makes it more challenging to know when to put the device to sleep since the main loop doesn’t necessarily know when there aren’t any tasks that need servicing right now, without some extra work. 

  5. There will still be some tasks that need to be real-time and can’t afford to wait even a few ms in a job queue to be serviced. I’ve personally found that interrupts are sufficient for handling this kind of real-time behavior, but your needs may vary. Mixing a job queue with some real-time RTOS threads may be a way to get the best of both worlds — if you need it. 

  6. Closures are stored on the virtual heap. 

  7. It’s in a compacted state if you run a GC collection cycle after each event, which you would do if you cared a lot about idle memory usage. 

Short-lived memory is cheaper

TL;DR: RAM on a microcontroller should not just be thought of as space but as space-time: a task that occupies the same memory but for a longer time is more expensive.

MCU memory is expensive

If you’re reading this, I probably don’t need to tell you: RAM on a microcontroller is typically very constrained. A 3 GHz desktop computer might have 16 GB of RAM, while a 3 MHz MCU might have 16 kB of RAM: a thousand times less processing power but a million times less RAM. So in some sense, RAM on an MCU may be a thousand times more valuable than on a desktop machine. Regardless of the exact number, I’m sure we can agree that RAM is a very constrained resource on an MCU [1]. This makes it important to think about the cost of various features, especially in terms of their RAM usage.

Statically-allocated memory

It’s common especially in smaller C firmware to just pre-allocate different pieces of memory to different components of the firmware (rather than using malloc and free). For example, at the global level, we may declare a 256-byte buffer for receiving data on the serial port:

uint8_t rxBuffer[256];

If we have 1kB of RAM on a device for example, maybe there are 4 components that each own 256 B. Or more likely, some features are more memory-hogging than others, but you get the idea: in this model, each component in the code owns a piece of the RAM for all time.

Dividing up RAM over time

It’s of course a waste to have a 256-byte buffer allocated forever if it’s only used occasionally. The use of dynamic memory allocation (malloc and free) can help to resolve this by allowing components to share the same physical memory at different times, requesting it when needed, and releasing it when not needed.

This allows us to reconceptualize memory as a mosaic over time, with different pieces of the program occupying different blocks of space-time (occupying memory space for some time).

When visualizing memory like this, it’s easy to start to feel like the cost of a memory allocation is not just the amount of memory it locks, but also the time that it locks it for (e.g. the area of each rectangle in the above diagram). In this sense, if I allocate 256 bytes for 2 seconds then it costs 512 byte-seconds, which is an equivalent cost to allocating 512 bytes for 1 second or 128 bytes for 4 seconds.
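
As a rough illustration of the byte-seconds idea, the three allocations mentioned above can be compared with a tiny helper. The function name allocationCost is hypothetical and only exists for this sketch:

```javascript
// Hypothetical helper: the "space-time" cost of one allocation, in byte-seconds.
function allocationCost(sizeBytes, durationSeconds) {
  return sizeBytes * durationSeconds;
}

// The three allocations from the paragraph above.
const costs = [
  allocationCost(256, 2), // 256 bytes held for 2 seconds
  allocationCost(512, 1), // 512 bytes held for 1 second
  allocationCost(128, 4), // 128 bytes held for 4 seconds
];

console.log(costs); // [ 512, 512, 512 ]
```

All three come out at the same cost, which is the sense in which they are equivalent under this model.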

Being a little bit more rigorous

Skip this section if you don’t care about the edge cases. I’m just trying to be more complete here.

This measure of memory cost is of course just one way of looking at it, and breaks down in edge cases. For example, on a 64 kB device, a task[2] that consumes 1 B for 64k seconds seems relatively noninvasive, while a task that consumes 64 kB for 1 s is much more costly. So the analogy breaks down in cases where the size of the allocations is significant compared to the total memory size.

Another way the model breaks down is if many of the tasks need memory around the same time, for example if there is some burst of activity that requires collaboration between many different tasks. The typical implementation of malloc will just fail if there is no memory available right now, as opposed to perhaps blocking the thread until the requested memory becomes available, as if the memory were a mutex to be acquired.

But the model is accurate if we make these assumptions:

  • The size of the individual allocations is small relative to the total memory size
  • Allocations happen randomly over time and space

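Under these assumptions, the total memory usage becomes a random variable whose expected value is:

The expected allocation size × the expected allocation duration × the expected allocation frequency

(The frequency multiplied by the duration gives the expected number of allocations that are alive at any instant; multiplying by the size turns that count into bytes.)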

We could also calculate the probability that the device runs out of memory at any given point (I won’t do the calculations here).
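
To make the expected-value estimate concrete, here is a minimal sketch that multiplies an expected allocation size by the expected number of live allocations (frequency × duration). All of the numbers are invented for illustration, and the calculation assumes that size and duration are independent:

```javascript
// Illustrative numbers only; none of these come from a real device.
const meanSizeBytes = 64;        // expected allocation size
const meanDurationSeconds = 0.5; // expected time each allocation stays alive
const allocationsPerSecond = 10; // expected allocation frequency

// On average, frequency x duration allocations are alive at any instant.
const meanLiveAllocations = allocationsPerSecond * meanDurationSeconds;

// Expected memory in use = expected size x expected number of live allocations.
const meanBytesInUse = meanSizeBytes * meanLiveAllocations;

console.log(meanLiveAllocations, meanBytesInUse); // 5 320
```

Halving the lifetime of each allocation halves the expected usage, exactly as halving its size would.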

Conclusion

In situations where memory allocations can be approximated as being small and random, the duration of a memory allocation is just as important as its size. Much more care must be taken to optimize memory usage for permanent or long-lived operations.

I’m not very happy with this stochastic viewpoint and all these edge cases. It means that at any point, we could randomly exceed the amount of available memory and the program will just die. Is there a better way to organize memory space-time so we don’t need to worry as much? I believe there is… and I’ll cover that in the next post.


  1. Another thing that makes it a constrained resource is the lack of virtual memory and the ability to page memory in and out of physical RAM, so the RAM size is a hard limit 

  2. When talking about this space-time model, I think it’s easier to talk about a “task” than a “component”, where a task here is some activity that needs to be done by the program over a finite stretch of time, and will consume resources over that time. 

Can you parse this?
JavaScript Corners

What does the following JavaScript mean:

const x = await / +y; const z = await / +y;

Hint: it’s a trick question.

The answer depends on the context, as is demonstrated by the following snippet:

function foo() {
  const y = 10;
  const await = 5;
  const x = await / +y; const z = await / +y;
  console.log(x);
}
async function bar() {
  const y = 10;
  const x = await / +y; const z = await / +y;
  console.log(x);
}
foo(); // Prints 0.5
bar(); // Prints / +y; const z = await /10

Within the context of an async function, await is treated as a keyword, and whatever follows await is parsed as an expression. In JavaScript, an expression that starts with a forward slash is a RegExp literal, and that literal ends at the next unescaped forward slash. The +y at the end then represents string concatenation, so both the regular expression and y are converted to strings, and the concatenated result string is "/ +y; const z = await /10".
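
The regular-expression reading can be reproduced outside an async function by writing the same characters without the leading await. This is only a sketch of how the parser groups the tokens; the variable names pattern and text are mine:

```javascript
const y = 10;

// Everything between the two slashes is one RegExp literal.
const pattern = / +y; const z = await /;

// Adding a number to a RegExp converts both operands to strings and concatenates them.
const text = pattern + y;

console.log(pattern instanceof RegExp); // true
console.log(text); // / +y; const z = await /10
```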

This interpretation is easier to visualize if the syntax highlighting identifies and colorizes the respective parse tokens as follows:

Outside of the context of an async function, await is just a normal identifier and has no special meaning (this is important so that the introduction of the await syntax to the JavaScript language didn’t modify the meaning of existing JavaScript code which might have used await as a variable or parameter name).

If syntax highlighting were correct, as seen in the above images, the difference would be pretty obvious. Unfortunately, I needed to photoshop the above images, since VS Code highlights both examples the same way, and both incorrectly:

Global To-String
JavaScript Corners - Part 9

(This is Part 9 in my series on JavaScript corner cases).

Here’s another one.

In JavaScript, global variables are properties of the global object. By default, the global object is like any other, and inherits from the Object.prototype object. Object.prototype comes with a number of its own properties, such as the toString method. So, that means that toString is also a global variable[1].

console.log('toString' in global); // prints true
console.log(toString === global.toString); // prints true
console.log(global.toString()); // prints [object global]
console.log(toString()); // prints [object Undefined]. Why is this?

Everything seems expected, except the last line, which might seem a little confusing. The toString() call is clearly invoking a function using a reference to that function, where the base of the reference is the global object, right? (Take a look at my posts on references). So surely toString() and global.toString() mean the same thing?

Wrong.

There’s a subtlety here. The unqualified toString reference actually has a base value[2] that is the global environment, which “knows about” the global object, but is not exactly the global object. The base object for the global environment is actually always the value undefined. See here in the spec. This is why it prints “[object Undefined]”.
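
The two outcomes can be reproduced by calling Object.prototype.toString with an explicit this value. This sketch uses globalThis rather than the Node-specific global, so the exact tag printed for the global object depends on the host:

```javascript
// What `global.toString()` amounts to: `this` is the global object.
const viaGlobalObject = Object.prototype.toString.call(globalThis);

// What the bare `toString()` call amounts to: `this` is undefined.
const viaUndefined = Object.prototype.toString.call(undefined);

console.log(viaGlobalObject); // host-dependent, e.g. "[object global]" in Node.js
console.log(viaUndefined);    // [object Undefined]
```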

 


  1. To qualify as a global variable, there is actually an additional criterion. The property of the global object must not be listed in the set of unscopables on the global object. In this case, toString is not listed as an unscopable, since it was introduced into JavaScript before the existence of the unscopables feature, and for backwards compatibility it remains that way. 

  2. Recall that a reference has two components: the thing that the name is looked up on (the base), and the name of the thing being referred to. For example, obj.x refers to the property named x on the object obj. 

JavaScript Corners – Part 9
Node.js With-Statement Bug

What does the following evil code print?

var x = 'before';
var obj = { x };
with (obj) {
  x = (delete x, 'after');
}
console.log(x);

If you’re not sure, don’t worry — neither are current JavaScript engines. Firefox prints “after”, while Edge, IE, and Node.js print “before” (node v7.9.0). I believe that Firefox is correct in this case.

The tricky statement is obviously the following one, which sets a property on an object in the same statement that deletes the property:

x = (delete x, 'after');

(Side note: if you’re not very familiar with JavaScript, the relevant language features that are being used here are the delete operator, comma operator, and the good ol’ evil with statement).

What we expect to happen

The statement var x introduces a new variable at the script scope[1].

The { x } expression creates a new object with a single property[2] x, where the value of x is copied from the variable x in the outer scope, so it has the initial value of ‘before’.

The with statement brings the properties of the object obj into scope in a new lexical environment.

The statement x = (delete x, 'after') should perform the following steps:

  1. Evaluate the left hand side
  2. Evaluate the right hand side
  3. Assign the value from the right hand result, to the reference created when evaluating the left hand side

When the left hand side is evaluated, the property x will be found in object obj. The base value of the reference is the object, not the script variable scope.

The right hand side evaluates to ‘after’, but in the process it deletes the property x from obj. However, the reference on the left hand side should still refer to “the property named ‘x’ on the object obj”, even though the property with that name is now deleted.

When the assignment happens, it should create a new property named ‘x’ on object obj, with value ‘after’. The variable x in the outer scope should be left unaffected.
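
The three steps can be emulated by hand, without a with statement, by modelling the reference as an explicit pair of base object and name. This is a sketch of the expected semantics as described above, not a test of any particular engine; the reference object and the name outerX are purely illustrative:

```javascript
const outerX = 'before';       // stand-in for the outer variable x
const obj = { x: outerX };     // obj.x starts out as 'before'

// Step 1: evaluate the left-hand side to a reference (base object + name).
const reference = { base: obj, name: 'x' };

// Step 2: evaluate the right-hand side, which deletes the property on the way.
const rightHandValue = (delete obj.x, 'after');

// Step 3: assign through the reference captured in step 1.
reference.base[reference.name] = rightHandValue;

console.log(obj.x);  // after
console.log(outerX); // before
```

Under this model the property is recreated on obj and the outer variable is untouched, which is the behavior argued for above.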

In this case, I think Node.js gets the wrong answer.


  1. Theoretically, the script scope is the global scope. But in Node.js, scripts are wrapped in a module wrapper that changes the behavior of global vars. This doesn’t affect the outcome of this experiment though 

  2. Bonus fact. Object literals inherit from the global intrinsic object Object.prototype, which has other properties on it, such as toString. So when I say that it has a single property, it would be more accurate to instead say that it has a single own property 

JavaScript Corners – Part 8
References (Continued)

Given an object o with a member function f that prints out what the this value is:

const o = {
  f() {
    console.log(
      this === global ? 'global' :
      this === undefined ? 'undefined':
      this === o ? 'o':
      '-');
  }
}

We know what the following prints:

o.f();  // prints "o"

And we know what the following prints[1]:

const f = o.f;
f(); // prints "global"

I always thought that the difference came down to the fact that o.f() is actually invoking a different operator, something like a “member call operator”.

However, what do you think the following prints?

(o.f)();

My guess, up until today, would have been that this prints “global”, since with the parentheses, this is no longer invoking the member call operator, but is instead invoking the call operator.

But I was wrong. There is no such thing as a “member call operator”. Rather, the “call” operator just behaves differently depending on whether the target of the call is a value or a reference[2].

So this actually prints “o”.

(o.f)(); // prints "o"

But hang on. Why didn’t the parentheses coerce o.f to a value?

One might have expected the parentheses to automatically dereference o.f, something like the following examples that use the logical OR and comma operators to coerce the target to a value instead of a reference:

(o.f || 0)(); // prints "global"
(0, o.f)(); // prints "global"

Indeed, this could have been the case for bare parentheses as well, but the language designers chose not to do it that way, so that the delete and typeof operators still work when extraneous parentheses are provided:

delete o.f; // The "correct" way to delete a property
delete (o.f); // This also works
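
Here is a self-contained sketch that puts the cases from this post side by side. Instead of printing, the method returns whether this is the object itself, so the result does not depend on strict mode; the variable names are only for illustration:

```javascript
const o = {
  f() {
    return this === o; // true only when called with `o` as the receiver
  }
};

const plainCall = o.f();         // reference call: `this` is o
const parenthesized = (o.f)();   // parentheses keep the reference: `this` is still o
const viaComma = (0, o.f)();     // comma operator yields a value: `this` is not o
const viaOr = (o.f || 0)();      // logical OR yields a value: `this` is not o

const deleted = delete (o.f);    // parentheses keep the reference here too
const stillHasF = 'f' in o;

console.log(plainCall, parenthesized, viaComma, viaOr); // true true false false
console.log(deleted, stillHasF); // true false
```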

 


  1. assuming the use strict directive isn’t provided in this case 

  2. To be more accurate, the call also behaves differently depending on whether the target reference refers to a property of an object or a variable in an environment record 

JavaScript Corners – Part 7
Calls and With Statements

Here’s a quick one. What does the following print? (Assuming not in strict mode)

function foo() {
  console.log(this.name);
}

const bar = { foo, name: 'Bar' };
global.name = 'Global';

foo();         // Case 1
bar.foo();     // Case 2
with (bar) {
  foo();       // Case 3
}

In non-strict mode, the naked function call foo() gets a this value that is the global object. So the first case prints “Global”.

In the second case, we’re invoking foo as a member of bar, and so the this value is bar (it prints “Bar”).

The last case is the most interesting, and the most useless (since with statements are strongly discouraged, and cannot be used in strict mode). The this value in this case is actually bar. Because the name foo is found as a property of bar through the with statement, JavaScript implicitly uses the bar object as the this value for the call. This prints “Bar”.
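
The third case can be checked in isolation even from strict-mode code, because a function created with the Function constructor is non-strict unless its body opts in. In this sketch, callInsideWith and the parameter name scope are hypothetical names, and foo returns the name rather than printing it:

```javascript
function foo() {
  return this.name;
}

const bar = { foo, name: 'Bar' };

// The body below is sloppy-mode code, so the with statement is allowed.
const callInsideWith = new Function('scope', 'with (scope) { return foo(); }');

// foo is resolved as a property of bar, so bar becomes the this value.
const result = callInsideWith(bar);

console.log(result); // Bar
```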

JavaScript Corners – Part 6

In what order does the following evaluate?

a()[b()] = c()[d()] = e()[f()];

TL;DR Answer

get a
call a
get b
call b
get c
call c
get d
call d
get e
call e
get f
call f
get e.f
set c.d
set a.b

Step 1: Variable Access

First off, what does this code even mean? If you’re not intimate with JavaScript, this might seem like a very confusing line of code. In fact, even if you’re familiar with JavaScript, this can be confusing.

So let’s break it down, starting with:

a

The expression a loads the value a from the surrounding scope[1]. This is done by searching up the scope chain until a is found.

There are a number of different types of scopes in JavaScript, including those that refer to blocks (like the inside of a for-loop), functions (the contents of a function), and objects (scopes that are created using a with statement, or the global scope).

For our purposes, let’s define a at the global scope. You’ll see why in a moment. Assuming we’re working in Node.js, the global object is called global, and properties of the global object are part of the global scope[2].

global.a = 42;
console.log(a); // prints 42

But, since we’re interested in the order of evaluation, it would be useful to know when the value a is accessed. Luckily, in JavaScript, you can define properties that have a getter and/or setter, which we can use to log when the global variable is accessed:

Object.defineProperty(global, 'a', {
  get: function() {
    console.log('get a');
    return 42;
  }
});
console.log(a); // prints "get a" followed by "42"

Great! We can now see when the global variable “a” is accessed. There aren’t many languages where you can do that. Hooray for JavaScript.

We may want to define more globals this way, so let’s refactor this to use a helper:

function defineGlobal(name, value) {
  Object.defineProperty(global, name, {
    get: function() {
      console.log(`get ${name}`);
      return value;
    },
    configurable: true
  });
}
defineGlobal('a', 42);
console.log(a); // prints "get a" followed by "42"

Step 2: Calling the function

Now let’s look at the following statement:

a()

This is, unsurprisingly, a function call. It first evaluates a, as indicated above, by fetching a from the current scope. Then it calls a as a function. Nothing special going on here.

But to make this work with our a, we’re going to need to make sure that a is defined as a function, and not the value 42. So let’s change our getter to return a function:

defineGlobal('a', function() {
  console.log('call a');
  return 42;
});
console.log(a());
// get a
// call a
// 42

To answer our original question, we’re going to need to create a whole bunch of functions. So let’s again refactor this into a helper:

function defineFunction(name, body) {
  defineGlobal(name, function() {
    console.log(`call ${name}`);
    return body();
  });
}
defineFunction('a', () => 42);
console.log(a());

Step 3: Member access

The expression x[y], in JavaScript, is a property lookup. It evaluates the expressions x and y, and then finds the property on the object x that has the name resulting from the expression y. Here’s a snippet that illustrates this:

defineGlobal('x', { myProp: 42 });
defineGlobal('y', 'myProp');
console.log(x[y]);
// get x
// get y
// 42

If you’re not very familiar with JavaScript, it’s important to note here that the property name used here is "myProp", and not "y". The property name is the result of evaluating y.

Again, it will be useful to know exactly when the property is accessed, so let’s use a getter instead:

defineGlobal('x', {
  get myProp() {
    console.log('get x.myProp');
    return 42;
  }
});
defineGlobal('y', 'myProp');
console.log(x[y]);
// get x
// get y
// get x.myProp
// 42

Here I’ve just used the getter syntax for object literals (available since ES5), rather than using defineProperty.

As before, we’re going to need to do this a few times, so let’s create a helper function:

function createObject(objectName, propertyName, propertyValue) {
  return {
    get [propertyName]() {
      console.log(`get ${objectName}.${propertyName}`);
      return propertyValue;
    },
    set [propertyName](v) {
      console.log(`set ${objectName}.${propertyName}`);
      propertyValue = v;
    }
  };
}
defineGlobal('x', createObject('x', 'myProp', 42));
defineGlobal('y', 'myProp');
console.log(x[y]);

Step 4: Assignment

The last piece of the puzzle is the assignment operator. Consider the following code:

x = y

The assignment operator, like the other operators so far, will evaluate each operand, and then perform some operation on the results. In the above case, x is evaluated, and then y is evaluated, and then the result of y is assigned to the result of x.

But wait. What do you mean “the result” of x?

The model here that JavaScript uses internally, is that x actually evaluates to a reference. This is a type used inside the ECMAScript specification, which you’ve probably never heard of because it is never exposed directly to programs. A reference value consists of two components:

  • A base value, that tells you what container the value is stored in
  • A name, that tells you which value in the container is being referred to

In this case, the expression x evaluates to a reference that has the following attributes:

  • A base value that is the global object
  • A name that is the string "x"

In other words, the reference value is something like the English description “the property x on the global object”. When you assign to x, you are assigning to “the property x on the global object”. When you delete x, you are deleting “the property x on the global object”.

The expression y also evaluates to a reference, but the assignment operator coerces that reference to the actual referenced value. The same thing is done in expressions such as x + y or x(y).
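
To make the idea of a reference more tangible, here is a minimal model of one as a plain object with a base and a name. The helper names (makeReference, getValue, putValue, deleteBinding) are my own, loosely inspired by the operations described above, and a plain object called fakeGlobal stands in for the global object:

```javascript
// A reference is modelled as a (base, name) pair.
const makeReference = (base, name) => ({ base, name });

// Dereference: fetch the value that the reference points at.
const getValue = (ref) => ref.base[ref.name];

// Assign through the reference.
const putValue = (ref, value) => { ref.base[ref.name] = value; };

// Delete whatever the reference points at.
const deleteBinding = (ref) => delete ref.base[ref.name];

const fakeGlobal = { x: 1, y: 2 };
const xRef = makeReference(fakeGlobal, 'x');
const yRef = makeReference(fakeGlobal, 'y');

putValue(xRef, getValue(yRef));        // models `x = y`
const afterAssign = fakeGlobal.x;

deleteBinding(xRef);                   // models `delete x`
const stillThere = 'x' in fakeGlobal;

console.log(afterAssign, stillThere);  // 2 false
```

Note that only the right-hand reference is dereferenced; the left-hand one is kept as a (base, name) pair until the assignment or deletion actually happens.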

Here’s another example of an assignment:

x.y = z

In this case, the base value of the reference is the object x, and the name is y.  The assignment sets the value referred to as “the property ‘y’ of the object x”. Similarly, you can do delete x.y to delete “the property ‘y’ of the object x”.

In a more detailed consideration of the above example, x and z evaluate to references. Both x and z are then coerced to values (dereferenced, by fetching the property or variable), and then a third reference is created that refers to the property y on the base object x.

But, what order does this occur in? To find out, let’s use our trusty helper functions:

defineGlobal('x', createObject('x', 'y'));
defineGlobal('z', 42);
x.y = z
// get x
// get z
// set x.y

This might come as a little bit of a surprise. The expression x is evaluated before the expression z, and then the assignment takes place. In some ways, one expects the opposite: one expects that the left hand side of an assignment is not considered until the right hand side has been evaluated.

This seems to be a general rule in JavaScript. Operands are evaluated from left to right, and then the operator is executed. Perhaps an exception to this rule-of-thumb, is that the short-circuiting operators such as && must necessarily execute part of the operation without all the operands fully evaluated.
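
The left-to-right rule and the short-circuit exception can both be observed without touching the global object, by wrapping each operand in a small tracing helper. The names trace and log are only for this sketch:

```javascript
const log = [];

// Record that an operand was evaluated, then pass its value through.
const trace = (label, value) => {
  log.push(label);
  return value;
};

const target = {};

// Operands of an assignment: object, then key, then right-hand side.
trace('left object', target)[trace('left key', 'k')] = trace('right value', 1);

// Short-circuit: the right operand of && is never evaluated here.
const shortCircuit = trace('and-left', false) && trace('and-right', true);

console.log(log);
// [ 'left object', 'left key', 'right value', 'and-left' ]
```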

Side note: in languages such as C++, the evaluation order of the left and right hand sides of most operators is not defined. The compiler can choose to evaluate them in whatever order it thinks is best, or even evaluate them simultaneously (e.g. if the CPU has multiple cores). JavaScript is different, in that the specification lays out a specific, unambiguous ordering.

We can follow this to its logical conclusion, and determine the order of execution of the whole of the original program in question:

defineFunction('a', () => createObject('a', 'b'));
defineFunction('b', () => 'b');
defineFunction('c', () => createObject('c', 'd'));
defineFunction('d', () => 'd');
defineFunction('e', () => createObject('e', 'f'));
defineFunction('f', () => 'f');

a()[b()] = c()[d()] = e()[f()];
// get a
// call a
// get b
// call b
// get c
// call c
// get d
// call d
// get e
// call e
// get f
// call f
// get e.f
// set c.d
// set a.b

Can we abuse it? (Advanced)

The reason I started looking into this at all, is that I was trying to discover a way to “see” references. They are objects that exist in the execution model, but are never shown explicitly to the user of the language, so do they really need to exist at all?

This is important to me, because I’m writing a JavaScript compiler, and need to know whether references are best left as just a description mechanism in the ECMAScript specification, or if they should be considered to be real entities with real allocated memory in the runtime.

So, can we design an example, that unequivocally proves that there must be a reference allocated in memory at some point?

Here’s my attempt:

let resolveZ;

defineGlobal('z', new Promise(resolve => resolveZ = resolve));

async function asyncAssignment() {
  x[y] = await z;
}

defineGlobal('x', createObject('x1', 'y1'));
defineGlobal('y', 'y1');
asyncAssignment();
console.log('...it should be waiting for the result of z at this point...');
const o = createObject('x2', 'y2');
defineGlobal('x', o);
defineGlobal('y', 'y2');
asyncAssignment();
// Let's switch out property 'y2' for a new one, to make sure it's not holding a
// pointer to the property itself, but is instead recalling it by name
delete o.y2;
Object.defineProperty(o, 'y2', {
  set: value => {
    console.log(`set x.y2 (redefined) to ${value}`);
  }
});
asyncAssignment();
// And lastly, let's delete x from the global scope
delete x;
console.log('...now we are going to resolve the promise for z...');
resolveZ(42);
// get x
// get y
// get z
// ...it should be waiting for the result of z at this point...
// get x
// get y
// get z
// get x
// get y
// get z
// ...now we are going to resolve the promise for z...
// set x1.y1
// set x.y2 (redefined) to 42
// set x.y2 (redefined) to 42

What I’ve done here is break up the x[y] = z assignment using the await operator. The await operator will suspend the statement (and the rest of the async function), allowing us to swap out various things in the environment to see if we can mess with the operation while it is suspended. What we’re trying to prove here, is that the reference itself must be preserved in memory, from the time that the operation is suspended, to the time that it is resumed (when z is resolved).

To make it even more apparent, I’ve executed the async function multiple times, trying different ways to “mess” with the pending operations.

Conclusions

This experiment has proven to me that references are “almost” tangible objects. We can see that they must exist in memory under some circumstances, and that they are not simple “pointer” values — they must refer to both the object and the property name.

This leads to some interesting results when it comes to the order of evaluation of various expressions. While this knowledge isn’t needed for everyday programming scenarios, it helps to have a deeper understanding of what’s going on so that we know where the limit lies.

 

 

 

 


  1. Known in ECMAScript as a Lexical Environment 

  2. There is an interesting recursion here, since the value global here is also a globally scoped binding, which means the global property on the global object points to itself. You can see this if you have a statement like console.log(global.global.a)