# JavaScript Corners – Part 5

Here’s a quick one:
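The behavior in question can be sketched like this (the variable name here is arbitrary):

```javascript
const f = function() {};   // an anonymous function on the right-hand side...
console.log(f.name);       // → 'f': the name comes from the assignment target
```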

This was quite unexpected to me. It’s the only case I’ve ever seen where the left-hand side of an assignment can affect the right-hand side.

This only happens once. Once the anonymous function has a name, it can’t be re-named:
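For instance, something along these lines:

```javascript
const a = function() {};   // the function acquires the name 'a' here
const b = a;               // assigning it to b does not rename it
console.log(b.name);       // → 'a'
```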

Interestingly, this doesn’t seem to work with destructuring:
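A sketch of the destructuring case:

```javascript
const [f] = [function() {}];
console.log(f.name);       // → '' (empty string): no name is assigned
```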

This implies that somewhere between when the anonymous function is instantiated and when it is added to the array literal, the anonymous function acquires a name. It’s not the destructuring itself that suppresses the name, as can be seen in the following example:
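For example, destructuring a function that already has a name leaves the name intact:

```javascript
const g = function() {};   // g.name becomes 'g' here
const [f] = [g];           // destructure an already-named function
console.log(f.name);       // → 'g': destructuring didn't strip the name
```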

It also doesn’t seem to work with anonymous functions passed as parameters:
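Something like this (the helper name is arbitrary):

```javascript
function getName(f) { return f.name; }
console.log(getName(function() {}));   // → '' (empty string)
```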

It also doesn’t seem to work with anonymous functions evaluated from more complicated expressions:
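For example, wrapping the function in a larger expression suppresses the naming:

```javascript
const f = true && function() {};   // the initializer is not a plain anonymous function
console.log(f.name);               // → '' (empty string)
```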

# JavaScript Corners – Part 4

Here’s another quick one. It relates to the scoping rules with parameter expressions.
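The snippet looks something like the following sketch (the specific values are assumptions; the shape matches the description below):

```javascript
var x = 'a', y = 'b', z = 'c';      // globals

function foo(
  x = 1,                  // parameter x, ordered before f
  f = () => [x, y, z],    // f closes over the scope surrounding it
  y = 2                   // parameter y, ordered after f
) {
  var x = 'x1', y = 'y1', z = 'z1'; // local variables
  console.log(f());
}

foo();   // → [ 1, 2, 'c' ] on node
```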

Function foo has three parameters: x, f, and y. The key thing here is that the default value for f is a function that closes over the lexical scope surrounding it, and we’ve designed the function in such a way that when we call it, it “tells” us the values of x, y, and z in that surrounding scope.

So the question is, which values of x, y, and z does f close over? There are global variables x, y, and z, there are parameters x and y, and there are local variables x, y and z. The parameter x is intentionally ordered before the function f is initialized, while parameter y is positioned to be after the function f is initialized.

I’ll spare you the time of running the code yourself. On my machine, in node.js 7.0.0, the output is [ 1, 2, 'c']. Clearly no local variables have been used, and f is capable of seeing parameter y even though it’s declared after f. The output is the same whether or not the code is executed in strict mode.

One of the most interesting things here, to me, is that the local variable x is not an alias for the parameter x.  This is strange when you consider the following code:
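The code in question is roughly this (a minimal sketch):

```javascript
function foo(x) {
  var x;             // re-declare the parameter, with no initializer
  console.log(x);
}
foo(42);             // → 42, not undefined
```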

Given that local variables are apparently not aliases of the parameters with the same name, you would expect the output of the above to be undefined, since the variable x is not initialized, even though the parameter x is. However, the output is actually 42.

Where did we go wrong in our logic? Let’s see if we can get an example that explores what’s going on:
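A sketch of such an experiment (the closures in the parameter list see the parameters directly, since parameter expressions are present; names are assumptions):

```javascript
function foo(x, y, getX = () => x, getY = () => y) {
  var x;               // no initializer
  var y = undefined;   // explicitly initialized to undefined
  x = 'mutated';       // change the variable, not the parameter
  console.log(x, getX());   // → mutated 1 (on node)
  console.log(y, getY());   // → undefined 2 (on node)
}
foo(1, 2);
```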

On node, the output shows that the variable x starts out with the parameter’s value but can be changed without affecting the parameter, while the variable y is undefined even though the parameter y still holds 2.

This clearly tells me that variable x is not an alias of parameter x, but rather a separate variable that is initialized to the same value as parameter x.

Also, as a side note, it tells me that there is a subtle semantic distinction between initializing a variable to undefined and not initializing a variable at all, which is very interesting.

I should note that we’ve explored this empirically, rather than deducing the behavior from the spec. When I run the same experiment in Firefox I get a different result: both instances of y are undefined.

Which is correct?

I believe the V8 (node.js) implementation is more accurate in this case. There’s a note in the ECMAScript specification that says:

> NOTE A separate Environment Record is needed to ensure that closures created by expressions in the formal parameter list do not have visibility of declarations in the function body.

I think this note is self-explanatory, and tells us that the closure function in the parameter list is not meant to be able to see the local variables of its parent function.

P.S.

I’m going to add one last investigation to this post. The above quote applies only if there are parameter expressions (that is, if some parameters have default values). If there are no parameter expressions, then apparently this second “environment” is not created, and so the variable x should be an alias for the parameter x.

At first I thought that there was no way that we could ever test this distinction. Can you think of a way to test this difference?

Spoiler alert, I did think of a way. If the code is not executing in strict mode, then the arguments variable holds aliases for the parameters, which makes it possible to mutate parameters without referring to them by name. This is important, because once you have a local variable with the same name, there is no way to know if the local variable is a copy of the parameter, or an alias of the parameter, based purely on code that identifies it by name.

Take a look at the following experiment:

First note that y is not used in this code sample, but you’ll see why I added it in a moment.

The first log output of “1” shows that the variable x is either a copy of the parameter x or an alias of it. Then we modify the parameter x, without touching the variable x. When we log the variable x again, we see that it has also changed: the variable and the parameter x are in fact two names for the same binding.

But now look at this slight modification to the code:
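The modification gives y an initializer, whose value matches the argument we pass:

```javascript
function foo(x, y = 2) {   // y now has an initializer (never evaluated here)
  var x;
  console.log(x);      // → 1
  arguments[0] = 2;    // arguments no longer aliases the parameters
  console.log(x);      // → 1: the variable is unaffected this time
}
foo(1, 2);
```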

Note that again y is not used. In fact, to keep this a controlled experiment, y is even initialized to the same value that is passed as an argument. The content of the initializer is not relevant, though, because the initializer is never evaluated. What is important in this case is that there is an initializer expression at all, regardless of whether it executes or not.

The surprising thing, as I noted in the comment on the snippet, is that the second output is now 1 instead of 2, resulting from the fact that the mutation of the argument does not change the variable.

This is not a bug in node.js, but just an interesting quirk of the spec, in another corner of JavaScript.

# JavaScript Corners – Part 3

I was busy reading ES spec today, and the following question popped into my head: at what point are default parameters evaluated?

To answer this question, I wrote the following script:
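A sketch of such a script (the parameter name and the specific calls are assumptions):

```javascript
function foo(a = console.log('a')) {}

foo();     // call with no argument
foo(1);    // call with an argument
foo();     // and again with no argument
```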

If default parameters are evaluated when the function is declared, then we would expect 'a' to appear once in the terminal output.

On the other hand, if the default parameter expressions are evaluated when the function is called, then we would expect 'a' to appear two or three times, corresponding to each time the function is called, and whether the default parameter expression is evaluated even when an argument is provided.

What do you think the output is? I won’t show you here… you can run the snippet yourself in a REPL to see if you’re correct or not.

As I did last time, I’ll leave you with this mind-bending challenge: what does the following do?

# JavaScript Corners – Part 2

I’m continuing my series on JavaScript corner cases.

Last time I looked how function names interact with each other and with variable names. Today I’m looking at how parameters interact with variables.

Take a look at the following code snippet:
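The snippet in question is essentially this (a minimal sketch):

```javascript
function foo(x) {
  var x;             // a variable with exactly the parameter's name
  console.log(x);
}
foo(1);              // → 1
```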

It creates a function that takes a parameter named x, and then calls that function with an argument with a value of 1 (you may have heard of parameters being called “formal parameters”, and arguments being called “actual parameters”, but I think that terminology originates largely from people who didn’t understand the distinction and so confused things for everyone else). The curve ball here is that we’ve then also declared a variable with exactly the same name as the parameter. Does the symbol x then refer to the parameter or the variable? Do variables shadow parameters?

The variable hasn’t been given a value, so it must be undefined, right? If x refers to the variable, then the console output will be “undefined“, while if x refers to the parameter, then the console output will be “1“. So which is it…

On my computer, using node, this outputs 1. So it seems x refers to the parameter, and not the variable… right?

But now consider the following code snippet:
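The second snippet adds an initializer to the variable:

```javascript
function foo(x) {
  var x = 2;         // same name, now given a value
  console.log(x);
}
foo(1);              // → 2
```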

This creates a variable, and then gives it the value 2. According to our theory thus far in this post, the console will still output the value 1, since the x refers to the parameter and not the variable, as we saw from the previous snippet.

But if you run this code, it actually outputs 2. So it seems that it’s not that variables or parameters “win” the battle to get into the namespace. Rather, parameters actually are themselves variables, living in the same namespace. The parameter x and the variable x are one and the same.

I’ll leave you with this bonus challenge: what does the following do?

# Why Types Introduce Coupling

Last time I gave a long post about why C is complicated and JavaScript can make your code simpler and less buggy. Today I want to touch on another idea: that statically typed languages introduce a level of coupling in your program that’s unnecessary and makes maintenance harder.

Imagine that I ask you to write a function to simply iterate through a list of integers and output them to the console. In C, you might write the following code:

But wait a second. I didn’t tell you that the list was going to be presented as a contiguous array! What if it was a linked list? Then the code might look like the following, assuming that we had previously declared the types LinkedList and LinkedListNode appropriately:

The problem in both of these is that the function we’ve written is intimately coupled to the type of the parameters it accepts. There’s no way in C to express the idea of “I want my function to accept any list-like type”. This means that we cannot reuse the above code in other circumstances where the type is even slightly different.

## C++

In C++, we can address this problem to some extent by using templates. We could write the above function something like this in C++:

There are still some of the same issues in this code. For example, do we pass an iterator pair to the function (as many STL functions do), or do we pass the container itself?

To me this is a non-solution. For one thing, it’s saying that if you want to write reusable code that’s decoupled from the types of its parameters, you need to write everything in your program as templates. This is clearly not an option for many reasons, including compilation time, error messages, complexity, etc.

All that C++ templates give us is a taste of a world without explicit types. The above C++ function is decoupled from the caller simply because it does not state what type it expects.

It’s true that there are languages that perform type inference, and allow you to write pretty generic code without losing static type safety, but as with C++, the error messages and language complexity associated with this type of coding makes it difficult for beginner-to-intermediate level programmers.

## C#

At first glance, you may think that higher-level statically-typed languages solve this with better abstraction mechanisms. Consider the C# code for the above function:

The above function says that it accepts any argument that is enumerable, meaning anything that can be iterated. This is fantastic from a reusability and code maintenance perspective, so it does address one side of the issue. But there is another issue lurking behind the scenes. One that may be more noticeable to people working in very resource-constrained environments like in embedded firmware.

Although the above C# code does not couple itself to the specific type of integer list, and so can be reused throughout the program, it does actually still have a level of coupling at the binary level: the function accepts something that has a binary signature of IEnumerable<int>. This is unlike a C++ template, which is decoupled by being instantiated separately for each distinct binary signature.

This machine-level coupling is invisible to the programmer, which is convenient and important when it comes to managing large and complex codebases. But remember that interfaces as a language feature are normally implemented at the binary level by passing control through several levels of indirection. I won’t go into detail – you can look it up yourself if you’re interested – but here is a brief summary to illustrate my point.

The machine code must first look up a virtual dispatch table from the interface pointer, and then look up the correct entry in the dispatch table corresponding to the method you want to call, and then perform an indirect call through that entry in the dispatch table. The indirect call doesn’t go straight to the function implementation, but first through a thunk that adjusts the this  pointer from the interface’s this  pointer to the underlying object’s this  pointer (which are different because objects can implement multiple interfaces), and then the thunk redirects to the actual function. This is to perform a single function call on an abstract interface.

Although the C# language doesn’t dictate these details, you have to be aware of them anyway. The logical types are still coupled to the binary types, and so the choices you make about how generic you want your code really do still have an effect on the performance and overheads involved in your program.

## JavaScript

So what’s different about JavaScript? Simply put, JavaScript doesn’t require you to specify these type details at all. This not only brings a level of decoupling to the logical program, where you gain benefits of reusability and maintainability, but also unshackles the compiler to theoretically apply a whole new range of optimizations. As compilers get better and better, I think this will be more and more the case, and soon the performance of a program will depend more on how few details you dictate than on how many, and the performance of JavaScript programs will exceed that of C and C++.

# Why JavaScript

People who have known me for a while, and long-time readers of this blog, will know that I never used to be a fan of dynamically typed languages. The static type system available in C# and C++ can help to prevent many bugs and runtime errors, and allows the programmer to add extra clarity to the code.

But in recent years, I’ve come to acquire a taste for JavaScript. So much so that I’m actually on a mission to write a JavaScript compiler for embedded firmware, so that I can bring the bliss of JavaScript to the nightmare of embedded programming.

For those who don’t already know, JavaScript is a completely different language to Java. Please, do not use the word Java to refer to JavaScript, or vice-versa. They are as unrelated to each other as C++ is to Java. I know this is stated many times, but I’m expecting that some of the readers of this post are coming from a purely C/C++ background, and might not be aware of this fact.

Many people have written about the pros and cons of JavaScript, so what I’m saying here is not unique, but I’m going to approach it from the perspective of an embedded firmware programmer, who typically works in C or C++ on a daily basis. Some of what I say will also apply to those of you in other arenas.

## Familiarity

One thing that JavaScript has going for it is that the syntax and structure will be familiar to most embedded programmers (and many other programmers). It uses curly braces to denote blocks of code. Function calls are done using parentheses with comma-separated arguments, as in foo(1,2,3). Most of the C/C++ operators work as expected, such as the logical operators && and ||, and the bitwise operators & and |.

These may seem like obvious things, if you’re from a background of C/C++, C#, or Java, or some related language. But bear in mind that not all languages are like this, and so the familiarity of JavaScript eases the learning curve associated when coming to JavaScript from languages like C/C++. Compare these two snippets of code, one in C, and the other in JavaScript, that implement the Fizz-Buzz challenge:
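The JavaScript side of that comparison might look like the sketch below; the C version is nearly identical apart from the loop-variable declaration and the printf calls:

```javascript
for (let i = 1; i <= 100; i++) {
  if (i % 15 === 0) console.log('FizzBuzz');
  else if (i % 3 === 0) console.log('Fizz');
  else if (i % 5 === 0) console.log('Buzz');
  else console.log(i);
}
```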

If you’re a C programmer, then hopefully the above similarity appeals to you.

## Simplicity

Perhaps the number one thing I like about JavaScript is simplicity. With JavaScript you can often write very clean and readable code, because you’re focusing on the algorithm and behavior, rather than adding bloat associated with types and ABI’s and memory concerns.

In C++, you’re writing code to appease three different rulers: the type system, the machine, and the reader/maintainer. Most C++ code seems to be a compromise between these, but in JavaScript your code can focus on the reader/maintainer, and let the compiler worry about the machine. You worry more about clearly describing, in code, the behavior of the program, rather than worrying about how the compiler/interpreter will implement that behavior in terms of instructions and memory access, or worrying about constructing elaborate types to describe the meta-rules about what the program is allowed to do.

As a thumb-suck estimate, I probably spend at least half my time in C++ trying to appease the static type checker. And I would say that 90% of that work is spent on false-positives – cases where the type checker notes an inconsistency, but where the same code in a dynamically typed language would not have had a bug in it. Inconsistencies in the type system do not always equate to real program inconsistencies.

At the risk of making this post far too long, I’m going to give you an example, albeit somewhat contrived. Let’s say that we have a binary tree structure: a node in the tree is either a leaf-node with a value, or an internal node with no value but with left and right subnodes/children. Now let’s say that we want a function that flattens the tree, returning an ordered sequence of only the leaf values.

### In JavaScript

In JavaScript, I can imagine a function that looks like this:
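Something along these lines (node shape and field names are assumptions):

```javascript
function flattenTree(root) {
  const result = [];
  const stack = [root];         // nodes still to be visited
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.isLeaf) {
      result.push(node.value);
    } else {
      stack.push(node.right);   // push right first so left pops first
      stack.push(node.left);
    }
  }
  return result;
}

const tree = {
  isLeaf: false,
  left: { isLeaf: true, value: 1 },
  right: {
    isLeaf: false,
    left: { isLeaf: true, value: 2 },
    right: { isLeaf: true, value: 3 },
  },
};
console.log(flattenTree(tree));   // → [ 1, 2, 3 ]
```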

It uses a stack to iterate the tree. I could have made a much more succinct solution using generators and recursion, but I’m appealing to those of you who are coming from strictly imperative, procedural programming backgrounds, so that’s why I chose this approach.

### In C

Let’s write the equivalent code in C.

Firstly, a node can be either a single value, or branches into two subnodes. How do we represent this?

Here is one way. A node can be either a leaf value or an internal node, and to tell the difference, we probably need some kind of flag (or tag). It also needs something to store the contents of the node, which is a union between the two different options:

Side note: the style of the above code might not be what you’re familiar with. Or it might be. I don’t know, because C (and C++) doesn’t come with a standard style, which is one of the ways in which I think JavaScript is better: there is a single generally-accepted style for writing JavaScript code.

The contents of an internal node are its left and right sub-nodes:

Hmm. There’s another question here that’s unanswered. Who owns these nodes? Does the tree own them? Could there be multiple trees that share the same nodes? Is the answer even the same from one tree to the next, or for all the nodes within a tree? It’s not specified here, but it’s yet another “bloaty” detail to figure out.

We said that a leaf node is “a value”. But what value exactly? Is it an integer? Another struct? Is the value type the same for all nodes in a tree? Should the memory for the value be embedded into the node, or should the node point to the value? If it points to the value, then who owns it? If it’s embedded, then how big is it, and are there any rules we have to follow when copying it or moving it to different locations in memory (is it pointed to by anything else, or does it have ownership of anything else)? So many questions. So many details that aren’t relevant to the problem at hand.

One way here is just to say that a leaf node has a pointer to the value, and that we don’t know anything further about what type it is:

I could save on some of the bloat by using an anonymous union, but I’d argue that not a whole lot is gained in terms of simplifying the real complexity.

How much time have we wasted so far? Let’s recap what we’ve done:

• We’ve defined one possible implementation of a tree node
• We’ve coupled the implementation to issues of memory layout, such as whether nodes point to their children and values or have them embedded in the same memory.
• We’ve opened many cans of worms regarding ownership, style, the type of contents, etc.
• We haven’t even started writing the actual function yet.

Now for the actual function (oh dear, this is going to be a long post). We need a way to pass the tree to the function, and a way to retrieve the list back. We’ve already defined a type for the tree, but there are other questions to be answered when it comes to passing it to the function:

• Should the root node be passed by value, or by pointer?
• Is the function expected to mutate or free the tree passed to it? For example, can the output list reuse the memory from the input tree?
• Should the tree be passed using a shared global variable, or an actual parameter?1

There are similar questions about getting the output from the function, with all the same concerns that we’ve already discussed about representing trees. Should the result be a contiguous array or a linked list? Should it be allocated by the caller or callee? Should it be a shared global variable? And any number of other considerations.

I’m going to try my hand at a direct implementation in C, trying to sidestep issues of memory allocation by having everything passed to the function itself:

This code makes me cringe.

It doesn’t really match the spec, because it has a hard limit on how deep a tree can be. To get around that we would need some dynamic memory allocation, which would add a whole lot more bloat and complexity. This code also requires that the caller have some idea of the maximum size of the resulting list, which may or may not be easy to know.

The complexity is ridiculous. How many times do you need to look at *list++ = node->leafContents before you can be sure that you’re incrementing the pointer to the list of pointers, and not a pointer in the list? Maybe we need to add some more bloat to encapsulate this: more functions for managing the list, so we only have to write that kind of code once. Don’t even get me started!

So let’s see how well our beloved type system did. I’m going to compile the above code and see what errors come up.

Here is the list of mistakes I made. I’m categorizing each as either true-positive (the compile error saved my skin), false-positive (with a dynamic type system I would not have had a runtime bug), or false-negative (a bug I found by inspection that the compiler didn’t catch).

• I forgot to #include <stdbool.h>  … arguably a false-positive, since stdbool is only needed if you have static types.
• I forgot to include stdio.h. True-positive: I forgot to include a module that was relevant to the program behavior.
• In creating the example tree, on the line tree[0].internalNodeContents.left = tree[1], I was missing an & sign. I’m going to say this is also a false-positive. I was assigning one value to another, and the fact that one value is typed by pointer and the other by value is not a concern related to the algorithm or code logic.
• To pop a value off the stack, I used stack[stackSize--] instead of stack[--stackSize]. This is a false-negative. The type system did bugger-all to protect me from accessing invalid memory. Luckily the hardware caught it and issued a seg-fault, but on embedded hardware you aren’t always so lucky! What’s more, the code that caused the issue is unrelated to the algorithm that the function was supposed to be implementing. In a sense, it’s the implementation of a completely different algorithm (the algorithm for popping a stack). So the bug in the code was not just un-noticed by the C compiler, but was in a real sense caused by the limitations of the C language.
• In printf("%d", list[i]), I was logically printing an integer to the console, since the list is a list of integers, but actually the integers are physically stored as references (pointers), so it should have been printf("%d", *((int*)list[i])). Pretty, ain’t it? This is a false-negative. There was a bug, but the type checker failed to find it. Instead it just printed out crap to the console. On GCC with the default settings2, there was no warning about this.
• I’m not returning a “success” code, or checking the return code when the function is called. This caused no error in this case, but might cause strange behavior if there was something that did check the result error code, or a case where the error code was necessary (a failure). I’d call this a true-negative in this particular case. The function acts unexpectedly, but it doesn’t explicitly claim otherwise, so there’s no spec that it’s defying. What’s more, it doesn’t introduce a bug into this particular program.

So how does that compare with JavaScript?

Well, what happened when I ran the JavaScript program? Exactly what I expected to happen. It ran. It output the correct result. No errors. No bugs.

This is not because I’m an expert in JavaScript. I have many more years’ experience in C than JavaScript. It’s because simple, bloat-free code is easy to reason about, and so less likely to contain bugs.

Conclusion: please use JavaScript instead of C. You will have fewer bugs because your code is simpler. It will also cost less to develop because there are fewer concerns to worry about, and it will be easier to maintain because the code is clear and easy to understand.

### In C++

I’m not going to implement the above in C++, but instead I’m going to say, in a hand-wavy way, that I don’t think it’s much better. In C++, you could write something that looks similar to the JavaScript version, using a stack from the STL to implement the stack variable. But the problem with this is similar to the problem with C: the implementation is coupled to the machine in a way that means when you bring in your favourite container, you’re forcing the compiler’s hand when it comes to implementing the code in terms of machine instructions and memory. The result is essentially bloat of a different kind. It gets messy, and making a solution as generic as the JavaScript one would require a ton of code, and with it a ton of bugs.

That’s all I’m going to say for the moment. If you come from the land of C++ and want to hear my opinion in more detail, leave a comment or send me an email, and perhaps I’ll make another post about it. This one is well long enough that I should be moving on to my last point.

## Safety

The above C example leads me to another great thing about JavaScript: safety.  What I mean by safety (or lack thereof) is:

• How easy is it to introduce bugs?
• How bad are the bugs, and how difficult are they to fix?

C is awful in this respect. The simple, real bug in the above code, where I dereferenced a pointer that wasn’t assigned, leaves the program open to the most hideous kinds of erroneous behaviors – those that are non-deterministic, and can affect anything in the program. Once you’ve crossed the line of undefined behavior, things in completely unrelated parts of your program can start failing for no apparent reason, no matter how well you wrote them. This is not just a door for bugs, but also for malicious attackers.

In JavaScript, there is no such thing as undefined behavior. There are a few things that are implementation-defined, meaning that different JavaScript engines will execute them differently, but there is nothing like C or C++’s undefined behavior, where a line of code can have literally any effect, including damaging data or even functions in unrelated parts of the program. When you want behavior to be well-defined, use JavaScript instead of C/C++.

JavaScript is also a very good abstraction. Programs execute in isolation from the rest of the system, which is great if you have safety-critical or security-critical applications which need to guarantee some sort of behavior.

## Conclusion

I could go on and on about the benefits of JavaScript, and perhaps I will in later episodes, but for the moment I hope that in this extraordinarily long post I’ve convinced you that there is some hope to JavaScript, even to the point of using it in embedded firmware development.

1. Most people would say using a parameter is preferable, but as I’ve said before: in C you’re appeasing multiple gods. The choice of whether to use a global variable or a parameter is not just about what is easier to reason about or better for code reuse, it’s also about the function ABI and the machine instructions generated.

2. and the -std=c99 flag

# JavaScript Corners – Part 1

Recently I’ve been trying to write a simple JavaScript compiler, and it’s led me to think more deeply about some JavaScript behavior that I previously would not have thought about, and I’d like to share that with my JavaScript readers.

Take a look at the following JavaScript code, and try to figure out what it outputs to the console (I’ll give you a hint: it doesn’t output any errors):
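The snippet was along these lines (a reconstruction; the numbering matches the discussion below, and note that engines have changed their handling of function declarations in blocks over the years):

```javascript
function test() {
  foo();   // which foo gets called?

  function foo() { console.log('foo 1'); }          // foo 1
  function foo() { console.log('foo 2'); }          // foo 2
  if (false) {
    function foo() { console.log('foo 3'); }        // foo 3
  }
  var x = function foo() { console.log('foo 4'); }; // foo 4: named function expression
  var foo = function() { console.log('foo 5'); };   // foo 5: anonymous function in a variable
}
test();
```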

The function code first calls foo – but which foo?

Perhaps it executes foo 1. After all, there can only be one function called foo, and so the others may fail to bind after the first one is declared, leaving the first as the “real one”.

On the other hand, perhaps it’s not the first definition of foo that “wins”, but the last one. So could it be foo 5?

But the last definition of foo is a variable named “foo”, not a function. So which wins when it comes to binding a symbol: variables or functions? Or are they treated equally?

If variables are somehow considered “second prize” to functions when it comes to finding which value matches which name, then it wouldn’t be foo 5 that wins, but rather foo 4, since foo 4 actually has a function named foo, whereas foo 5 is only a variable named foo that holds an anonymous function.

But foo 4 is not declared at the outer scope of the function. Could it be that declarations at the outer scope win against declarations that are declared in some kind of nested scope? Perhaps then the answer is foo 2, since foo 2 is the last function that is actually named foo and declared in the outer block?

The only one we’ve completely ruled out, that it can’t possibly be, is foo 3. The function foo 3 is not the first foo, and not the last. It is not declared in the outer block (and if non-outer-block declarations could win then foo 4 would win). It is also nested inside a block that is only executed on a condition, and the condition is always false so the block is clearly never executed, and thus the function is never declared anyway.

So it can’t be foo 3, but it could be any of the others. Which do you think is it?

You may have guessed it based on my harpings-on. The answer, when I execute the script using node, is indeed foo 3.

In JavaScript, the nested scope is not really a nested scope. All local var and function declarations are at the function scope of the whole parent function. If you’ve been working in JavaScript for a while, you probably already know this. The fact that it’s in a “false” condition block has nothing to do with it, since function declarations are not “statements” to be executed in the sequence of the program (and if they were, then this whole snippet would fail).

What was interesting to me is that it must be the case that when function and variable names clash, the functions always win.

The foo 4 function is a little misleading. Just because the function has a name, doesn’t mean that it’s attached to the function scope, because in the case of foo 4 the function is a function expression. These are not even part of the function’s namespace, as you can see if you execute the following snippet, and see that it gives you an error:
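For example (the try/catch is only there so the sketch can show the error without crashing):

```javascript
var x = function foo() { return 'I am foo'; };
try {
  foo();   // the name of a named function expression is not
           // bound in the enclosing scope
} catch (e) {
  console.log(e.name);   // → 'ReferenceError'
}
```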

Does "use strict" cure this strange behavior, and somehow give a parse failure? On my machine when I change the function to have "use strict", it doesn’t give any errors, but it does change the output to foo 2 instead of foo 3. I found this quite unexpected.

This is all interesting behavior. These things don’t affect everyday JavaScript much, but they are fun and interesting corner cases to consider, and help us understand the language more deeply.

Let me know what you think about this in the comments. Hopefully I’ll continue this series with more of the interesting corner cases that I find along the way.

# Continuations in C

There are times when you need to call a function, but you want to say “call me back when you’re done” rather than blocking the current thread.

An example might be when you’re reading a file. If you imagine for a moment that every CPU cycle is on the scale of 1 second, then disk access is on the order of days to months (take a look at the Coding Horror post about it). When you call a simple C function like fread, you could be blocking the current thread for millions of CPU cycles. You don’t want to block the thread, because threads are a valuable resource and multithreading is a difficult skill.

## The Typical Solution

The typical way to solve this in C is to use a callback function. I’m not going to explain callback functions here, since there’s an abundance of information about them on the internet. Instead I would like to point out a convenient pattern of how to store state for the callback function.

Let’s use a concrete example. Say we have some function bar, which is expected to take a long time to execute, and a function foo which needs to call bar. The synchronous way of writing the code (non-callback way) might look like this:
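The post’s original snippet isn’t shown here, but a minimal sketch of the synchronous form might look like this (the names foo and bar and the result 42 come from the text; the printf is illustrative):

```c
#include <stdio.h>

/* A long-running operation, e.g. one that reads from disk. */
int bar(void) {
    /* ... imagine millions of CPU cycles of work here ... */
    return 42;  /* the result of the long process */
}

void foo(void) {
    int result = bar();  /* blocks the current thread until bar is done */
    printf("Result: %d\n", result);
}
```
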

The task finishes by returning some result of the long process. For the purposes of this example, we’ll say that the result is 42.

If we convert it to the asynchronous form (the callback form) it might look like this:
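A sketch of the callback form, with the continuation named continue_foo as in the text (the exact signatures, and the foo_result variable used to make the effect observable, are illustrative):

```c
#include <stdio.h>

typedef void (*BarCallback)(int result);

/* In real code, bar would save the callback and return immediately,
   invoking it later when the slow work completes. It is called
   directly here only as a convenience for the example. */
void bar(BarCallback callback) {
    callback(42);
}

int foo_result;  /* illustrative: records the continuation's result */

/* The continuation: the second half of what used to be foo. */
void continue_foo(int result) {
    foo_result = result;
    printf("Result: %d\n", result);
}

void foo(void) {
    bar(continue_foo);
    /* foo returns immediately; the rest happens in continue_foo */
}
```
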

Note that normally bar would not call the callback itself, but instead save the callback to be called later. I’ve only called it directly from bar as a convenience in the example.

## The Problem

I’ve seen this pattern many times. But it’s flawed in a major way: if foo has some state that must be persisted across the call to bar, how does it get that state to the continue_foo function? For example, if I declare a variable in foo, how do I access the variable in continue_foo? Typically what I see is that people will simply use global variables. But that’s an awful solution for many reasons1.

## Slightly Better

A better pattern, which I’ve used myself quite often, is for foo to tell bar, “please hold the value of XYZ for me, and when you call me back, please give XYZ back to me to remind me why I called you in the first place and help me remember where I left off”. It might look like this:
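A sketch of this state-passing pattern (only foo, bar, continue_foo, and Foo_state come from the text; the callCount field and other details are illustrative):

```c
#include <stdio.h>

/* foo's persistent state: whatever foo needs to remember across the
   asynchronous call (the field here is illustrative). */
typedef struct Foo_state {
    int callCount;
} Foo_state;

typedef void (*BarCallback)(void* state, int result);

/* bar sees only void*, since callers other than foo may use it too.
   As before, the callback would normally be saved and invoked later,
   not called directly. */
void bar(BarCallback callback, void* state) {
    callback(state, 42);
}

void continue_foo(void* state, int result) {
    Foo_state* fooState = (Foo_state*)state;  /* recover where we left off */
    printf("Result: %d (call count %d)\n", result, fooState->callCount);
}

/* The caller decides where the state lives; foo only fills it in. */
void foo(Foo_state* state) {
    state->callCount = 1;
    bar(continue_foo, state);
}
```
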

A few quick points I’d like to draw your attention to:

• Bar only sees the type void*, and not something more specific like Foo_state, because obviously bar may be called by other functions as well, not just foo
• Rather than allocating foo’s state on the heap, foo just accepts the state as a parameter, leaving it up to the caller to decide where it must be allocated. This parameter is only to say where the state should be stored, and is not expected to have any values populated by foo’s caller.

Let me emphasize that last point again: there is no heap allocation involved in this example. The state could just as easily be statically allocated, or pool-allocated, or even stack-allocated2. In particular, consider that foo’s caller is likely to face the same state-management problems foo has faced, and so might already have its own state structure, which would provide the perfect home for foo’s state structure without incurring an additional heap allocation.

## The Best Solution

But we can do even better. The problem with the above example is that we’re passing two things around: the callback function pointer, and the callback state pointer3. Let’s take a look at a way of doing this while only passing one pointer:
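A sketch of the single-pointer version (again, only foo, bar, continue_foo, and Foo_state come from the text; everything else is illustrative):

```c
#include <stdio.h>

/* A generic callback: its first argument points back at the callback
   function pointer itself, which by design is also a pointer to the
   caller's whole state structure. */
typedef void (*Callback)(void* callback, int result);

typedef struct Foo_state {
    Callback call;  /* MUST be the first field, so that a pointer to
                       this field is also a pointer to the whole struct */
    int callCount;  /* illustrative persistent state */
} Foo_state;

/* bar now takes a single pointer-to-function-pointer. */
void bar(Callback* callback) {
    int result = 42;  /* ... the result of the long-running work ... */
    /* "Call the callback, and tell it which callback we called" */
    (*callback)(callback, result);
}

void continue_foo(void* callback, int result) {
    /* Valid cast: call is the first field of Foo_state */
    Foo_state* state = (Foo_state*)callback;
    printf("Result: %d (call count %d)\n", result, state->callCount);
}

void foo(Foo_state* state) {
    state->call = continue_foo;
    state->callCount = 1;
    bar(&state->call);  /* one pointer carries both function and state */
}
```
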

I’ll draw your attention to the differences:

• Foo_state now contains a field called call which holds the callback function pointer. It’s important that this field is the first field in the structure so that a pointer to this field is also a pointer to the whole structure.
• The callback function signature still accepts the state as a parameter, as before.
• The call to bar no longer takes two parameters but now only takes a pointer to the callback function pointer (note the double-pointer)
• When bar needs to call the callback function, it needs to dereference it first. It also needs to pass the callback state. But since, by design, we’ve said that a pointer to the callback function [pointer] is also a pointer to the callback state, we can simply pass that pointer as the argument. This gives us the interesting syntax (*callback)(callback, result). In a sense, this is saying “call the callback, and tell it which callback we called”.

Those who are familiar with how object-orientated programming works under the hood may recognize this pattern. Typically objects are laid out in memory such that the first field in the object state is a pointer to the class vtable. When you call a virtual member function on the object, the pointer-to-the-object is treated as a pointer-to-the-vtable-pointer and is used to resolve the virtual dispatch. In our example above there is actually less complexity and overhead, since we don’t need a whole vtable but can point directly to the function.

I love this pattern because it’s really clean and quite simple. The whole callback, including the function and the state, is neatly represented by a single pointer4.

The callback pointer can be called using a very self-contained syntax. That is, it only depends on one variable, not two. This is actually not just a matter of syntax: a single variable means better use of CPU registers, and fewer accesses to memory.

The fact that the callback is represented by one small value also makes it easier to manage. There’s much less risk of calling the callback with the wrong state. It’s also lighter to pass around.

The most obvious disadvantage to me is that it’s uncommon. Someone looking at the code for the first time won’t understand what’s happening straight off the bat. It also means that there’s no language support for it. C++ is in some ways an extension to C with language support for first-class objects, but there is no common language that is an extension to C with support for this kind of first-class function (with state).

The performance of using this pattern isn’t a disadvantage in my opinion. If you’re comparing it to the performance of a “naked” function pointer, then yes, you may incur some overhead from passing the additional state argument and from double-dereferencing the function pointer. But consider that this type of function call should actually be faster than calling a virtual function in most object-orientated languages (which involves a triple-dereference), because there’s no vtable lookup. And virtual function calls are in turn typically faster than interface function calls (and, correspondingly, than virtual calls under multiple inheritance, depending on the optimizer and conditions).

I’d also like to dispel another supposed disadvantage, not directly related to the pattern but about using callbacks in general. At first glance it seems that there is a lot of overhead in accessing persistent variables in the state structure, because instead of saying “x” you have to say “state->x”, which implies an extra pointer dereference and possibly some pointer arithmetic. But think about this: how are variables normally accessed anyway? Variables are normally stored in the stack frame, which is essentially a structure pointed to by the stack pointer. Yes, there may be less promotion of variables into registers, which would affect performance, but I think it’s less of a problem than you’d expect.

Likewise, at first glance it seems that there is extra space used to store the callback function pointer. But in reality, a stack frame also stores the “callback” function pointer anyway: we just normally refer to it as the “return address”. An important point to note in the last example, is that the very last thing foo does is call bar. This is what’s called a tail call, and it means that any half-decent optimizer will re-use foo’s stack frame space for bar. To put it another way: while bar is active, foo doesn’t use any stack space, but it does use space in the persistent state structure (wherever that may be), and the persistent state structure has many of the same attributes as the stack frame would have had, including a pointer into code space. From this perspective, there is no extra space required to store the callback address in the state structure.

The only thing missing is hardware support. A “normal” call has hardware support for automatically populating the return address into the state structure (aka stack frame) and saving register states etc (aka saving persistent variables). And a “normal” return has built-in support for dereferencing the stack pointer to obtain the return address pointer (note the double-pointer again) and jumping to that address, all in one step. But I imagine that if this pattern became more common in usage (probably with language support), hardware support would probably follow.

Until then, I still think it’s a great pattern to use in C, and we should all add it to our toolbox of C patterns.

1. Please ask me – I’ll be happy to tell you all the reasons why it’s so horrible

2. In the less likely scenario that the caller decided to manually block the thread using thread synchronization techniques.

3. On most modern architectures this would just mean that it takes twice the space, since there are two pointers involved. But C doesn’t require function pointers to be the same size as data pointers. One embedded architecture I work with has function pointers that are twice the size of normal heap pointers – after all, RAM is more expensive per bit than ROM

4. A RAM pointer, which in some cases is smaller than a function pointer, giving it yet another advantage over the typical callback

# Moonwalk debugging

How many times have you run into this situation:

You’ve just finished a large section of code. You launch your program in the debugger. You wait for it to start up, and then try it out a bit. But then… it crashes. Your debugger eagerly jumps up to tell you there’s a problem. You look through the call stack and all the program variables, and you find something strange. A value that’s just clearly wrong. Why is it 25 and not 5? Why didn’t it remove the item from the list correctly? Why is it not sorted? You look through the code to try to figure out the problem, but your conclusion is that, really, it should just be right. Your code looks fine – why did it get the wrong answer?  The only way to find out is to go back in time and breakpoint it at the point where it should have been generating the correct answer, but didn’t. So you set some breakpoints, and restart the whole program. You go through all the steps to reproduce the problem again (if you can even remember them). Finally you hit the breakpoint. You step slowly through your code, and finally find the problem. Or more likely, you find some other strange thing that may have led to the first strange thing, and have to repeat the process. Or in haste you accidentally step over the most important line of code instead of stepping into it, and have to start again.

This used to happen to me all the time. In fact, it used to be my primary method of debugging: a binary search through the code, each time rerunning it to reproduce the problem. Wouldn’t it be easier to just step backwards through the code? The program breaks when it enters some invalid state. By the law of causality, the bug must have been in code that was before the point where the strange state occurred. Stepping forwards seems like the least useful thing you can do.

Nowadays I don’t have this problem as much, and I’ll tell you the secret: Write programs in a pure functional style. Write pure functions. That is, write functions that always return the same value if called with the same arguments, and that don’t have side effects that change the state of the system when the function is called. Write code that doesn’t mutate state.

Let me give you a real example. Recently I’ve been developing a graphics library for an embedded microcontroller. The previous graphics system was stateful: you had a mutable image buffer, and then you called different subroutines, such as drawLine, to effect a change to the pixels in the image buffer. I was tasked with creating a new graphics system – in part because we now needed to render graphics to an area many times larger than the size of all the microcontroller’s RAM.

The design I produced is almost completely stateless. For example, instead of calling a rotate function to rotate a graphic, I call a rotated function, which returns a completely new graphic object representing a rotated version of the original. Did you catch that? The word “rotate” is imperative: it commands the machine to do something. The result of what it does is known as its side effect1. A function like this would probably return void, or a result code to confirm its success, and the rotation would be visible as a change in the original image.

On the other hand, the word “rotated” is, in this case, an adjective2. It describes not what the function does, but what its result is. If the argument is a picture of a dog, then the result is a rotated picture of a dog. It always will be. Every time. A rotated(dog) is always a rotated dog. The unrotated dog is still there, and still unchanged.

Then there are similar functions for creating a shifted graphic from an existing one, a composite graphic from other graphics, and a line, a polygon etc. As a quick implementation note: you may have guessed that these graphic functions can’t possibly return objects with all the pixels of the graphic they represent. The way this is resolved is with laziness – the graphic objects are only implemented as lightweight descriptions of what the resulting graphic should look like relative to their input. When it comes time to produce actual pixels, there is some short-circuiting magic that combines all the transformations into one ultra-efficient transformation, and then produces the pixels on-demand with almost no buffering required3.
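The real library isn’t shown in the post, but the idea of a lazy, immutable graphic can be sketched roughly like this (all names and fields here are invented for illustration; the actual design is surely richer):

```c
#include <stddef.h>

/* A graphic is a lightweight, immutable description of a result,
   not a buffer of pixels. */
typedef struct Graphic {
    const struct Graphic* source;   /* input graphic, or NULL for a leaf */
    int rotationQuarterTurns;       /* this node's transformation */
} Graphic;

/* "rotated", not "rotate": returns a NEW description;
   the original graphic is untouched. */
Graphic rotated(const Graphic* g) {
    Graphic r = { g, 1 };
    return r;
}

/* At render time, the chain of descriptions is collapsed into one
   combined transformation before any pixels are produced. */
int totalQuarterTurns(const Graphic* g) {
    int total = 0;
    for (; g != NULL; g = g->source)
        total += g->rotationQuarterTurns;
    return total % 4;
}
```

A rotated(dog) is always a rotated dog: calling rotated never changes its argument, so the unrotated original remains available.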

## Moonwalk the debugger

The great thing is that doing things this way plays really well with the debugger. These techniques work in other environments, but I’m specifically going to talk about my experience in Visual Studio. When your program stops at a breakpoint or exception, you can rewind the stack one frame at a time (in Visual Studio 2013 this is Shift+F11), and you can rewind within a particular frame by simply jumping directly to that line of code (“Set next statement”, or Ctrl+Shift+F10). If your program is stateful, this is sure to break something. Typically when rerunning the same code twice you might not expect it to do the same thing the second time, because the conditions may have changed since the first time it ran, and so the program may be in a different state, and thus produce different results. This isn’t a problem at all if you use immutable data structures and your functions have no state or side effects. Now you can step backwards!

In addition to being able to step backwards, you can also evaluate arbitrary expressions in the debugger. For example, I have a function like saveGraphic(graphic, filename), which takes a graphic and saves it to a bitmap file. This is technically not a side-effect-free function since it writes to a file, but for the purposes of debugging it has no side effects that modify the state of the program in any observable way, and so is “pure” for practical purposes. In Visual Studio, I can add a watch expression like saveGraphic(dog, "debug.bmp"), and the debugger will happily execute the function. Now I can see the value of my immutable graphic object! Since rendering doesn’t mutate program state, I can use “pure” functions like this to visualize any complex objects. I can even evaluate transformation functions in the watch window, such as saveGraphic(rotated(dog), "debug.bmp"), knowing confidently that my program state hasn’t changed as a result of evaluating the watch.

I hope you enjoyed this post, and learned something new that will help you with your bug battles, and challenge the way you think about coding.

1. Whether or not a function’s “effect” is its primary purpose, it is still technically known as a “side effect”

2. I believe the term is deverbal adjective

3. I would probably have normally used matrices for this, but this microcontroller has no integer multiply or divide instructions, and no floating point anything

# Finish before you Start

I find a lot of the coding I do is completely different, unlike anything I’ve done before. The reason is that if it’s anything like something I’ve done before then I’ve failed to recognize the pattern and abstract it out. As a software engineer I want to never be doing repetitive tasks, because that’s what software is for (repeating tasks). If you’re repeating yourself in any way then you’re not fully leveraging your time. Every pattern, every similarity, every “oh, I know how to do this because I did something similar with xyz” should be under careful scrutiny.

After a certain point you may run up against the limitations of a programming language, since many design patterns are simply language features that are missing. I come across this all the time in embedded C programming, where a “linked list” is not a container but the design pattern of having a next pointer in a structure (and all the algorithms that go with it), and a “virtual class” is not a data type but the pattern of having a vtable pointer at the beginning of a structure.

At the point where the language betrays you, your next port of call could be writing your own domain-specific language to target the problems you’re working on. In C# this may include using reflection to interpret existing C# data structures in new ways (such as interpreting a class as a list of serializable members), using expression trees to interpret C# expressions in new ways, or using Reflection.Emit to dynamically generate bytecode.

But despite the huge advantages of automating the mundane programming tasks and patterns, the problem arises: if you never repeat yourself, then everything is new. Everything is uncharted territory1. How do you know how long it will take? How do you know if it will work? Will it really be useful?

The solution is simple: finish before you start. Do whatever it takes to get the results of your experimentation before you actually perform the experiment. In software this means developing proof of concepts. Hack the result together using whatever means necessary. Spare no time for optimization or good coding technique. Don’t worry about corner cases and details. Get to the finish line by cutting across the sports field. Get to the finish line by cheating. By redefining the race. By faking the results. Calculate them manually. Hand-draw the GUI. Fake the input data. Brute-force the algorithm. Anything it takes.

It helps to clearly define the interface of whatever you’re coding: How will it interact with the rest of the system? If it’s a web page, the interface is the GUI. If it’s a library, the interface is the publicly exposed API. If it’s a DSL, the interface is a syntax or spec. Once the interface is there, it doesn’t matter if the implementation is a birds-nest of cheating and killing. All that matters is that the interface offers just – just – enough to give you a little hint of what the future will look like, and what it will feel like. Get a tangible product you can play with2.

Once you’ve seen the end, then you can start. A clearer picture of where you’re going will help not only in producing a better product or code, but it might change your direction completely. Once your fortune is told, it may bring news of triumph, or of peril. Perhaps when you see the end you realize it actually isn’t so great there. It isn’t as easy to use as you thought. It’s more complicated than anticipated. Perhaps the whole principle is flawed in some horrible way. There are many things that are hard to anticipate without getting a closer look. Or a hands-on experience. Perhaps you need to reconsider some of your assumptions, or abandon the whole idea altogether.

It’s better to know before you start.

1. This is an exaggeration and oversimplification

2. I use the word “tangible” very loosely. A visual interface is “tangible” if you can see it, and a library interface is tangible if you can call it, etc.