Microvium Boost
It's Magic

Microvium Boost
It's Magic

TL;DR: This Microvium plugin (in development) optimizes Microvium bytecode by statically determining which variables and properties are accessible and how they might be accessed (read vs write), to decide whether to safely store them in ROM or remove them completely from the bytecode.


With the recent alpha release of Microvium, I’ve since turned my attention to a complimentary piece of the puzzle, which I’m calling Microvium Boost (the name may change). Microvium Boost might still be a few months from release, but some important and exciting parts are up and running, and I wanted to share some of the details here for anyone interested.

Microvium Boost is an optimization plugin for Microvium1. It hooks into the Microvium pipeline to optimize a snapshot of bytecode. As a recap, the snapshot is the file that you would download to target MCU; it’s a capture of the full running state of the Microvium virtual machine — see the Concepts page in the Microvium documentation.

Like a typical executable file, a bytecode file contains separate sections for data variables versus constants and functions. I’ve called these sections data and rom2 (these are closely analogous to the .data and .text segments produced by a C compiler). When a virtual machine is restored (loaded) on the target MCU, the data section is copied into RAM so the program can execute it, while the rom section is accessed directly from the bytecode image in ROM during execution (assuming you store the bytecode in ROM).

If I can summarize what Microvium Boost does in a single sentence, it’s this:

Microvium Boost determines the best memory section for each element in the snapshot bytecode, or if it can be completely discarded.

For some things, the best memory section is obvious. For example, function code is immutable and will always be stored in ROM. In the future, frozen objects could also probably be put straight into ROM without any analysis, but Microvium does not currently support freezing.

For all other variables and objects in the snapshot, we can store them in ROM if they are not going to ever be mutated at runtime3. It’s the job of Microvium Boost to determine this ahead of time through static analysis.

A related problem that Microvium Boost solves is whether an element in the snapshot needs to be kept at all. For example, it removes functions that are never called, and objects and variables that are never accessed.

This is a notoriously difficult problem to solve. Take a look at a previous post of mine where I demonstrated that Webpack’s static analysis for tree shaking is quite easy to fool into producing erroneous results.

The best way to show the kind of thing that Microvium Boost can do is by a few examples.

Examples

Moving values to ROM

The following is a simple example4. Here we have a script that creates an object with two properties and then exports a function to the host5 which returns one of those properties.

var obj = { // Stored in ROM by optimization
  x: 5,
  y: 6  // Removed by optimization
};

function run() {
  var temp = obj.y;  // Removed by optimization
  return obj.x;
}

// Export the `run` function to be called by the MCU host
exportValue(0, run); 

In the above case, Microvium Boost determines that object obj is not mutated (not just the variable, but the object itself), and that property y and variable6 temp are not used at all in a particular application.

It’s worth highlighting that var temp = obj.y does not count as a use of property y, since temp is not used. What counts as “usage” is determined lazily, with only those values used in I/O being real usage7, and all other intermediate values only being tentatively needed in case they aid in the computation of I/O.

Built-in Functions and Objects

There is another optimization that Microvium Boost does in the previous example. There are actually a number of built-in library functions that are also culled from the bytecode because they aren’t used (for example, Array.push is not used anywhere here).

This is quite an important feature of Microvium Boost: the ability to have a rich common library of useful functions that any particular script might employ, and have the unused parts of it dropped from the snapshot on a case-by-case basis depending on usage.

Function Parameters

Here’s another interesting example, with the optimization results noted a comments:

var w = 1; // Removed by optimization
var x = 2;
var y = 3;
var z = 4; // Removed by optimization

function run() {
  // Arguments `w` and `z` are removed by optimization
  var result = bar(w, x, y, z); 
  return result;
}

// Parameters `a`, `d` and `e` are removed by optimization
function bar(a, b, c, d, e, f) {
  return b + c + f;
}

exportValue(0, run);

These optimization opportunities might seem obvious when you read the code, but it’s actually quite a difficult problem to solve in the face of multiple possible callers and ambiguity in the function target. To highlight this complexity, consider if we changed run to have the following implementation, which also optimizes just fine with Microvium Boost:

function run(b) {
  var f;

  if (b) f = bar;
  else f = foo; // (Assuming that we define a function `foo`)

  // Arguments `w` and `z` are removed by optimization
  var result = f(w, x, y, z); 
  return result;
}

In this example, the call f(w, x, y, z) might either be calling bar or foo, and it’s impossible to know ahead of time which it is because it depends on the value b passed from the host. If we (the optimizer) remove a parameter from bar, we equally need to remove it from foo so that the parameters for f are consistent. And in general, bar and foo might similarly be called from multiple call sites in the program, so removing a parameter from bar may have implications that affect calls to foo elsewhere.

How does it work?

Microvium Boost is a whole-program optimizer that works differently to any other optimizer I know of. I’ve been working on the concepts behind the algorithm for some years now — they are borrowed from the type inference algorithm for MetalScript.

Naturally, an algorithm like this is beyond what I can explain properly in a blog post, but here is the general idea.

There are two steps to the algorithm:

  1. Determine the dependency relationship between elements of the snapshot (i.e. the program), by stepping sequentially through the source code8. For example, a call operation is related to the corresponding function(s) it calls, such that both have consistent expectations about the parameters to be passed.
  2. Use the relationships to determine consistent facts about the snapshot. For example, to decide whether a particular parameter can be culled from a particular function signature.

The following diagram is the resulting dependency graph for the “Function Parameters” example earlier. I’ve left the labels off the graph just to give a general sense of it9.

In simplified terms, each node in the graph corresponds to an element in the snapshot, such as an instruction, parameter, property, or global variable. The arrows are the inferred directional relationships between these elements. For example, an instruction which reads a global variable is related to the variable that it reads (if the former is not culled, the latter cannot be culled).

Nodes colored solid blue are those determined to be required in the final, optimized snapshot, while the empty circles represent elements that can be culled (much of the culled program is off-screen).

In this particular diagram, as annotated below, the solid-blue root node on the left is the particular IL instruction that returns control to the host: it passes the final value to the host (it is a point of output IO), and so the instruction that generates result cannot be culled from the snapshot. The dependency graph then pulls in a cascade of other operations and parameters that are transitively required in order to produce the single return value, ending on the far right of the graph with the globals variables x and y, which are the ultimate leaf dependencies required to compute the aforementioned output return value to the host. The rest of the nodes are not too important to this discussion at a high level except that they connect the result to the variables x and y from which the result is ultimately derived.

Here’s the corresponding source code again for reference:

var w = 1; // Removed by optimization
var x = 2;
var y = 3;
var z = 4; // Removed by optimization

function run() {
  // Arguments `w` and `z` are removed by optimization
  var result = bar(w, x, y, z); 
  return result;
}

// Parameters `a`, `d` and `e` are removed by optimization
function bar(a, b, c, d, e, f) {
  return b + c + f;
}

exportValue(0, run);

If you changed the source code by removing return result, indeed Microvium Boost will now safely cull global variables x and y, since that whole island of the dependency graph is only anchored by the data required by return operation.

It’s Magic

I subtitled this post “It’s Magic” because of Microvium Boost’s ability to deal with dynamic information, which seems almost impossible to me, despite the fact that I wrote the code for it!

We could make the example a bit more dynamic to illustrate this by making the call to bar indirect, as in the following code:

var w = 1;
var x = 2;
var y = 3;
var z = 4;

function run() {
  var result = call(bar, w, x, y, z);
  return result;
}

function call(func, arg1, arg2, arg3, arg4) {
  return func(arg1, arg2, arg3, arg4);
}

function bar(a, b, c, d, e, f) {
  return b + c + f;
}

exportValue(0, run);

The inferred dependency graph for this looks roughly the same, with an extra layer of nodes between the root return operation and the leaf global variables from which the returned information ultimately feeds. And just like before, it determines that w and z are not used, along with the corresponding arguments to call and bar.

Why Microvium Boost?

This kind of analysis is expensive to compute, but I think it’s well suited to the kind of scenario that Microvium is targeting, where the machine performing the optimization is thousands of times more powerful than the microcontroller device on which the optimized bytecode will run; where every byte and every instruction counts.

I think Microvium Boost enables a style of script programming that was previously infeasible for these kinds of situations.


  1. Microvium Boost is not part of the open-source Microvium codebase 

  2. There is actually a third section in Microvium which you don’t see in a C compiler, which is called gc and is the initial state of the heap in the same way that data is the initial state of the global variables. The principles that apply to data extend to gc 

  3. I believe the XS engine from Moddable takes a different approach, which is to store everything in ROM *until* it’s mutated at runtime, but this requires extra runtime overhead and indirection, and I didn’t choose this approach with Microvium 

  4. All these are working examples, mostly copied from the Microvium Boost automated test cases 

  5. The host of a typical Microvum program will be the surrounding C firmware on which the VM runs. 

  6. My choice of var over let or const for these examples is for readers who aren’t familiar with JavaScript, and may mistake the meaning of const or be confused by the name let

  7. Values used to discriminate control paths are counted as a special case of I/O, since most control paths eventually lead back to the host. 

  8. Actually, by stepping through the instructions in the intermediate language (IL)  

  9. Just ask me if you want to see the full, labeled graph. 

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.