Microvium Async – Part 4
The job queue

Microvium Async – Part 4
The job queue

TL;DR: the job queue in Microvium is used just for executing async continuations. It uses only 2 bytes of idle memory and doesn’t require allocating any threads or special support from the host of any kind.

This is the last of a 4-part series on async-await in the Microvium JavaScript engine:

In the previous posts, I talked about how async-await is useful in embedded systems, how it uses continuation-passing style (CPS) at its core to achieve a tiny memory footprint, and how it creates promises when needed to interface between its native CPS protocol and JavaScript’s standard promise protocol. In this post, I’ll talk about the design choices behind the job queue in Microvium.

What is a job queue?

JavaScript is fundamentally single-threaded1, but job queues allow some level of multi-tasking without multithreading, by allowing work to be broken up into small chunks that get executed sequentially.

If you’re an embedded programmer, you may be familiar with the so-called “super-loop” architecture, where a single main function has an infinite loop and will check various things to see what work needs to be done on each cycle of the loop. Often these checks are hard-coded, but an alternative design that’s quite scalable is to have a queue of work that needs to be done, and the main loop just pull work off the queue and performs it.

In Microvium, the job queue is used solely to execute async continuations (which includes all promise subscribers). If foo is awaiting bar, then when bar completes, a job will be scheduled to continue foo where it left off2.

This behavior is required for Microvium to be ECMAScript spec compliant, but it also has the advantage of using less peak stack memory, since a cascade of continuations will each be executed in a fresh job when the call stack is empty, rather than executing on top of each other like a cascade of traditional callbacks.

The Microvium job queue is invisible to the host

An important factor in the design of Microvium is to make it as easy as possible to integrate into host firmware. One thing I was worried about with the introduction of a job queue is that it would complicate the C integration API, because it could require some way to pump the queue (to execute enqueued jobs). Either this might appear like an explicit API call to Microvium, like mvm_processJobs(), or it might require additional hooks in the porting layer to allow Microvium to hook into the operating system to set up a job thread, or hook into a global super-loop, etc.

None of these options are acceptable to me. I want the “getting started” process to be as seamless as possible for newcomers. Forcing newcomers to do this integration work is just too complicated in my opinion. It would require conveying new concepts like “what is a job queue?” and “when should or shouldn’t I pump the job queue?”, just to get started with “hello world”.

The solution I went with in the end, is that Microvium automatically pumps the job queue at the end of mvm_call. That is, when the host (C code) calls a JavaScript function (using mvm_call), if the JavaScript function creates any new jobs, those jobs will be executed before control returns to the host. The VM waits until the guest (JavaScript code) call-stack is empty, but before returning to the host, and then pumps the job queue until all jobs have been completed.

This approach makes it similar to explicit callback-based asynchrony, where callbacks would be executed inline and therefore before control returns to the host. The difference is that the continuations are executed a little bit later before returning to the host. The amount of time that mvm_call blocks the host for doesn’t change between using async-await and using explicit callbacks.

To put it another way, the program may be sitting idle, waiting for something to happen, such as a timer, I/O event, or user interaction. When something happens, the host might mvm_call the guest to let it know about the event. The way that Microvium is designed, the guest will then fully process the event and reach its next equilibrium state before mvm_call returns to the host.

Job queue structure

How does the job queue look under the hood?

I’m very hesitant to add things to Microvium that use more memory permanently. So the job queue is implemented as a single VM register, which is just a 2-byte slot. It has multiple states:

  1. No jobs: the register is empty.
  2. Single job: the register points to a closure.
  3. Multiple jobs: the register points to a linked list.

Single job

When there’s only a single job enqueued, the jobs register simply points to the job. A job is any function, but in particular it’s almost always going to be a closure since a closure is a function type that can encapsulate some dynamic state (async continuations being a special case of closures). The job queue mechanism doesn’t care what the state is, but since this is a job queue for async continuations, the job closure is almost always going to capture the following state:

  1. The continuation to execute.
  2. Whether the awaited operation passed or failed (isSuccess).
  3. The result or error value (result).

This is all the state required for the job to invoke the continuation with (isSuccess, result), as required by the CPS protocol I defined in the previous post.

Multiple jobs

The vast majority of the time, there will only be zero or one jobs in the job queue. This is because it’s relatively rare for multiple async functions to be awaiting the same promise. But Microvium does need to cater for the case of multiple jobs. When this happens, the jobs register won’t point to a single job, but to a linked list of jobs.

The job pointed to by the register is the first job to execute, but there is an interesting design choice here where the prev pointer of the first node points to the last node in the list (the back of the list), making the linked list into a cycle.

This is simply to facilitate appending to the job queue in O(1) time when adding new jobs, because we can access the last node directly through the first node.

Normally you can append to a doubly-linked list in O(1) time by maintaining a separate pointer to the last node, but I didn’t want to permanently dedicate a whole VM register for the relatively-rare and short-lived case where multiple jobs are waiting. Representing the jobs in a cycle is a neat way to get the best of both.

Note that jobs are removed from the cycle before they’re executed, so the cycle does not imply that jobs are executed multiple times or that we hold onto jobs after they’re executed. It’s merely an optimization that allows Microvium to access both the beginning and end of the queue via a single register in O(1) time.

A quick note about GC overhead. The diagram above with 12 jobs in the queue has 24 memory allocations. All 12 jobs are guaranteed to be executed before control returns to the host3, so they can be considered to be short-lived (allocated and freed within a single call to the VM). In a language like C, doing a malloc for every job would be a lot of overhead, especially when jobs are as small as “run the next piece of this async function”. But in Microvium, this overhead isn’t too bad because of the characteristics of the Microvium garbage-collected heap:

  1. Allocation headers are small — only 2 bytes per allocation.
  2. Unlike with malloc, there is no memory fragmentation because the heap is compacted.
  3. There is no overhead for collecting garbage memory4. Rather, Microvium occurs collection overhead on everything that isn’t garbage. Even hundreds of tiny temporary allocations can be swept away in one fell swoop.
  4. Allocation time is O(1) because the Microvium heap is compacted. Creating new allocations is roughly analogous to just bumping a “free pointer” forward by N bytes5.


The job queue is one of the less complicated parts of the implementation of async/await, but still important. In the end, I’m quite happy with how the design turned out, requiring only 2 bytes of memory (the register itself) when the feature isn’t being used, being lean in the hot path where only one job is enqueued and executed at a time, but still allowing an unlimited number of jobs while keeping O(1) insertion and removal. And this was achieved without impacting the interface between the host and the VM at all, so users of Microvium don’t have any additional integration work.

  1. There are some multi-threaded capabilities in modern JavaScript, but it’s still single threaded at heart and Microvium doesn’t support any multithreading primitives. 

  2. In the previous posts, I may have used the shorthand of saying that the bar “calls” foo’s continuation. But now you know that it’s more accurate to say that bar *schedules* foo’s continuation to be called on the job queue. 

  3. In the case of reentrant calls, the all jobs will be executed before control returns to the host from the outer-most mvm_call. 

  4. Garbage memory does create memory pressure which indirectly has overhead by increasing the chance of a collection cycle. But during a collection cycle, the GC moves all living allocations to a new region and then frees the old region in one go, so the individual dead allocations do not contribute at all to the amount of time it takes to perform a GC collection. 

  5. It’s not quite as efficient as just bumping a free pointer, since it needs to check if the heap actually has enough space, and potentially expand the heap if there isn’t. 

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.