JavaScript Corners – Part 5

Here’s a quick one:
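Something like the following (a minimal sketch of the behavior; the variable name is arbitrary):

    var myFunction = function() {};
    console.log(myFunction.name); // prints "myFunction", inferred from the left-hand side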

This was quite unexpected to me. It’s the only case I’ve ever seen in which the left-hand side of an assignment can affect the right-hand side.

This only happens once. Once the anonymous function has a name, it can’t be re-named:
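A sketch of what I mean:

    var first = function() {};
    var second = first;
    console.log(second.name); // still "first": assigning it to second doesn't rename it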

Interestingly, this doesn’t seem to work with destructuring:
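For example, a sketch along these lines:

    var [myFunction] = [function() {}];
    console.log(myFunction.name); // prints "" on node: no name is inferred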

This implies that somewhere between when the anonymous function is instantiated and when it is added to the array literal, the anonymous function acquires a name. It’s not the destructuring itself that suppresses the name, as can be seen in the following example:
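An anonymous function used as a destructuring default value does acquire a name:

    var [myFunction = function() {}] = [];
    console.log(myFunction.name); // prints "myFunction"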

It also doesn’t seem to work with anonymous functions passed as parameters:
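A sketch:

    function getName(f) {
      return f.name;
    }
    console.log(getName(function() {})); // prints ""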

It also doesn’t seem to work with anonymous functions evaluated from more complicated expressions:
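For example, wrapping the function in a comma expression is enough to suppress the inference:

    var myFunction = (0, function() {});
    console.log(myFunction.name); // prints ""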

 

JavaScript Corners – Part 4

Here’s another quick one. It relates to the scoping rules with parameter expressions.
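The code went something like this (a sketch: the particular values here are my own, chosen to match the output discussed below):

    var x = 'a';
    var y = 'b';
    var z = 'c';

    function foo(
      x = 1,
      f = function() { return [x, y, z]; },
      y = 2
    ) {
      var x = 'local x';
      var y = 'local y';
      var z = 'local z';
      console.log(f());
    }

    foo();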

Function foo has three parameters: x, f, and y. The key thing here is that the default value for f is a function that closes over the lexical scope surrounding it, and we’ve designed that function so that when we call it, it “tells” us the values of x, y, and z in the surrounding scope.

So the question is, which values of x, y, and z does f close over? There are global variables x, y, and z; there are parameters x and y; and there are local variables x, y, and z. Parameter x is deliberately declared before f is initialized, while parameter y is declared after f is initialized.

I’ll spare you the time of running the code yourself. On my machine, in node.js 7.0.0, the output is [ 1, 2, 'c' ]. Clearly no local variables have been used, and f is capable of seeing parameter y even though it’s declared after f. The output is the same whether or not the code is executed in strict mode.

One of the most interesting things here, to me, is that the local variable x is not an alias for the parameter x.  This is strange when you consider the following code:
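Consider a sketch like this:

    function foo(x = 42) {
      var x;
      console.log(x);
    }
    foo();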

Given that local variables are apparently not aliases of the parameters with the same name, you would expect the output of the above to be undefined, since variable x is never initialized, even though parameter x is. However, the output is actually 42.

Where did we go wrong in our logic? Let’s see if we can get an example that explores what’s going on:
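Here’s a sketch of such an experiment (the getParams default closes over the parameter scope, so it can report the parameter values later):

    function foo(x = 1, y = 2, getParams = function() { return [x, y]; }) {
      var x;              // no initializer at all
      var y = undefined;  // explicitly initialized to undefined
      console.log(x, y);
      x = 50;             // mutate the local variable x
      console.log(getParams());
    }
    foo();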

With the sketch above, the output of this on node.js is:
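    1 undefined
    [ 1, 2 ]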

This clearly tells me that variable x is not an alias of parameter x, but rather a separate variable that is initialized to the same value as parameter x.

Also, as a side note, it tells me that there is a subtle semantic distinction between initializing a variable to undefined and not initializing a variable at all, which is very interesting.

I should note that we’ve explored this empirically, rather than deducing the behavior from the spec. When I run the same experiment in Firefox I get a different result: both instances of y are undefined.

Which is correct?

I believe the V8 (node.js) implementation is more accurate in this case. There’s a note in the ECMAScript specification1 that says:

NOTE A separate Environment Record is needed to ensure that closures created by expressions in the formal parameter list do not have visibility of declarations in the function body.

I think this note is self-explanatory, and tells us that the closure function in the parameter list is not meant to be able to see the local variables of its parent function.

P.S.

I’m going to add one last investigation to this post. The above quote applies only if there are parameter expressions (that is, if some parameters have default values). If there are no parameter expressions, then apparently this second “environment” is not created, and so the variable x should be an alias for the parameter x.

At first I thought that there was no way that we could ever test this distinction. Can you think of a way to test this difference?

Spoiler alert, I did think of a way. If the code is not executing in strict mode, then the arguments variable holds aliases for the parameters, which makes it possible to mutate parameters without referring to them by name. This is important, because once you have a local variable with the same name, there is no way to know if the local variable is a copy of the parameter, or an alias of the parameter, based purely on code that identifies it by name.

Take a look at the following experiment:
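Something like this (note: not in strict mode, so arguments aliases the parameters):

    function foo(x, y) {
      var x;
      console.log(x);   // prints 1
      arguments[0] = 2; // mutate the parameter without naming it
      console.log(x);   // prints 2: the variable changed too
    }
    foo(1, 1);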

First note that y is not used in this code sample, but you’ll see why I added it in a moment.

The first log output of “1” shows that the variable x is either a copy of parameter x or an alias of parameter x. Then we continue by modifying the parameter x, without touching the variable x. Then we log the value of the variable x and see that it’s also changed — because in fact the variable and parameter x are both symbols for the same binding.

But now look at this slight modification to the code:
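The same code, except that y now has a (never-evaluated) default value:

    function foo(x, y = 1) {
      var x;
      console.log(x);   // prints 1
      arguments[0] = 2; // no longer aliases the parameter
      console.log(x);   // now prints 1, not 2
    }
    foo(1, 1);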

Note that again y is not used. In fact, to keep this a controlled experiment, y is even initialized to the same value that is passed as an argument. The content of the initializer is not relevant, though, because the initializer is never evaluated. What matters in this case is that there is an initializer expression at all, regardless of whether it executes.

The surprising thing, as I noted in the comment on the snippet, is that the second output is now 1 instead of 2, resulting from the fact that the mutation of the argument does not change the variable.

This is not a bug in node.js, but just an interesting quirk of the spec, in another corner of JavaScript.

 


  1. https://tc39.github.io/ecma262/#sec-functiondeclarationinstantiation 

JavaScript Corners – Part 3

I was busy reading the ES spec today, and the following question popped into my head: at what point are default parameters evaluated?

To answer this question, I wrote the following script:
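It went along these lines (a sketch; the helper name is arbitrary):

    function sideEffect() {
      console.log('a');
      return 42;
    }

    function foo(x = sideEffect()) {
      return x;
    }

    foo();
    foo();
    foo(123);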

If default parameters are evaluated when the function is declared, then we would expect 'a' to appear once in the terminal output.

On the other hand, if the default parameter expressions are evaluated when the function is called, then we would expect 'a' to appear two or three times, corresponding to each time the function is called, and whether the default parameter expression is evaluated even when an argument is provided.

What do you think the output is? I won’t show you here… you can run the snippet yourself in a REPL to see if you’re correct or not.

As I did last time, I’ll leave you with this mind-bending challenge: what does the following do?

 

JavaScript Corners – Part 2

I’m continuing my series on JavaScript corner cases.

Last time I looked at how function names interact with each other and with variable names. Today I’m looking at how parameters interact with variables.

Take a look at the following code snippet:
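A sketch:

    function foo(x) {
      var x;
      console.log(x);
    }
    foo(1);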

It creates a function that takes a parameter named x, and then calls that function with an argument of 1. (You may have heard of parameters being called “formal parameters” and arguments being called “actual parameters”, but I think that terminology originates largely from people who didn’t understand the distinction and so confused things for everyone else.) The curve ball here is that we’ve then also declared a variable with exactly the same name as the parameter. Does the symbol x then refer to the parameter or the variable? Do variables shadow parameters?

The variable hasn’t been given a value, so it must be undefined, right? If x refers to the variable, then the console output will be “undefined”, while if x refers to the parameter, then the console output will be “1”. So which is it…

On my computer, using node, this outputs 1. So it seems x refers to the parameter, and not the variable… right?

But now consider the following code snippet:
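This time the variable is given a value:

    function foo(x) {
      var x = 2;
      console.log(x);
    }
    foo(1);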

This creates a variable, and then gives it the value 2. According to our theory thus far in this post, the console will still output the value 1, since x refers to the parameter and not the variable, as we saw from the previous snippet.

But if you run this code, it actually outputs 2. So it seems that it’s not that variables or parameters “win” the battle to get into the namespace. Rather, parameters are themselves variables, in the same namespace. The parameter x and the variable x are one and the same.

I’ll leave you with this bonus challenge: what does the following do?

 

 

Why Types Introduce Coupling

Last time I wrote a long post about why C is complicated and how JavaScript can make your code simpler and less buggy. Today I want to touch on another idea: that statically typed languages introduce a level of coupling in your program that’s unnecessary and makes maintenance harder.

Imagine that I ask you to write a function to simply iterate through a list of integers and output them to the console. In C, you might write the following code:
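Something like this sketch, if the list arrives as a contiguous array:

    #include <stdio.h>

    void printNumbers(int* numbers, int count) {
        for (int i = 0; i < count; i++) {
            printf("%d\n", numbers[i]);
        }
    }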

But wait a second. I didn’t tell you that the list was going to be presented as a contiguous array! What if it was a linked list? Then the code might look like the following, assuming that we had previously declared the types LinkedList and LinkedListNode appropriately:
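A sketch (with minimal declarations of the two types included so it stands alone):

    #include <stdio.h>

    typedef struct LinkedListNode {
        int value;
        struct LinkedListNode* next;
    } LinkedListNode;

    typedef struct LinkedList {
        LinkedListNode* first;
    } LinkedList;

    void printNumbers(LinkedList* numbers) {
        for (LinkedListNode* node = numbers->first; node != NULL; node = node->next) {
            printf("%d\n", node->value);
        }
    }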

The problem in both of these is that the function we’ve written is intimately coupled to the type of the parameters it accepts. There’s no way in C to express the idea of “I want my function to accept any list-like type”. This means that we cannot reuse the above code in other circumstances where the type is even slightly different.

C++

In C++, we can address this problem to some extent by using templates. We could write the above function something like this in C++:
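A sketch of the template version:

    #include <iostream>

    template <typename List>
    void printNumbers(const List& numbers) {
        for (const auto& value : numbers) {
            std::cout << value << '\n';
        }
    }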

There are still some of the same issues in this code. For example, do we pass an iterator pair to the function (as many STL functions do), or do we pass the container itself?

To me this is a non-solution. For one thing, it’s saying that if you want to write reusable code that’s decoupled from the types of its parameters, you need to write everything in your program as templates. This is clearly not an option for many reasons, including compilation time, error messages, complexity, etc.

All that C++ templates give us is a taste of a world without explicit types. The above C++ function is decoupled from the caller simply because it does not state what type it expects.

It’s true that there are languages that perform type inference, and allow you to write pretty generic code without losing static type safety, but as with C++, the error messages and language complexity associated with this type of coding makes it difficult for beginner-to-intermediate level programmers.

C#

At first glance, you may think that higher-level statically typed languages solve this with better abstraction mechanisms. Consider the C# code for the above function:
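A sketch of the C# version:

    using System;
    using System.Collections.Generic;

    static class Printer
    {
        public static void PrintNumbers(IEnumerable<int> numbers)
        {
            foreach (int number in numbers)
                Console.WriteLine(number);
        }
    }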

The above function says that it accepts any argument that is enumerable, meaning anything that can be iterated. This is fantastic from a reusability and code maintenance perspective, so it does address one side of the issue. But there is another issue lurking behind the scenes. One that may be more noticeable to people working in very resource-constrained environments like in embedded firmware.

Although the above C# code does not couple itself to the specific type of integer list, and so can be used throughout the program, it does actually still have a level of coupling at the binary level: the function accepts something that has a binary signature of IEnumerable<int>. This is unlike a C++ template, which is decoupled by having the template instantiated separately for each binary signature.

This machine-level coupling is invisible to the programmer, which is convenient and important when it comes to managing large and complex codebases. But remember that interfaces as a language feature are normally implemented at the binary level by passing control through several levels of indirection. I won’t go into detail – you can look it up yourself if you’re interested – but here is a brief summary to illustrate my point.

The machine code must first look up a virtual dispatch table from the interface pointer, then look up the correct entry in the dispatch table corresponding to the method you want to call, and then perform an indirect call through that entry in the dispatch table. The indirect call doesn’t go straight to the function implementation, but first through a thunk that adjusts the this pointer from the interface’s this pointer to the underlying object’s this pointer (which are different because objects can implement multiple interfaces), and then the thunk redirects to the actual function. All this to perform a single function call on an abstract interface.

Although the C# language doesn’t dictate these details, you have to be aware of them anyway. The logical types are still coupled to the binary types, and so the choices you make about how generic you want your code really do still have an effect on the performance and overheads involved in your program.

JavaScript

So what’s different about JavaScript? Simply put, JavaScript doesn’t require you to specify these type details at all. This not only brings a level of decoupling to the logical program, where you gain benefits of reusability and maintainability, but also unshackles the compiler, theoretically allowing a whole new range of optimizations. As compilers get better and better, I think this will be more and more the case, and soon the performance of a program will be more related to how few details you dictate, rather than how many, and the performance of JavaScript programs will exceed that of C and C++.

Why JavaScript

People who have known me for a while, and long-time readers of this blog, will know that I never used to be a fan of dynamically typed languages. The static type system available in C# and C++ can help to prevent many bugs and runtime errors, and allows the programmer to add extra clarity to the code.

But in recent years, I’ve come to acquire a taste for JavaScript. So much so that I’m actually on a mission to write a JavaScript compiler for embedded firmware, so that I can bring the bliss of JavaScript to the nightmare of embedded programming.

For those who don’t already know, JavaScript is a completely different language to Java. Please, do not use the word Java to refer to JavaScript, or vice-versa. They are as unrelated to each other as C++ is to Java. I know this is stated many times, but I’m expecting that some of the readers of this post are coming from a purely C/C++ background, and might not be aware of this fact.

Many people have written about the pros and cons of JavaScript, so what I’m saying here is not unique, but I’m going to approach it from the perspective of an embedded firmware programmer, who typically works in C or C++ on a daily basis. Some of what I say will also apply to those of you in other arenas.

Familiarity

One thing that JavaScript has going for it is that the syntax and structure will be familiar to most embedded programmers (and many other programmers). It uses curly braces to denote blocks of code. Function calls are done using parentheses with comma-separated arguments, as in foo(1,2,3). Most of the C/C++ operators work as expected, such as the logical operators && and ||, and the bitwise operators & and |.

These may seem like obvious things, if you’re from a background of C/C++, C#, or Java, or some related language. But bear in mind that not all languages are like this, and so the familiarity of JavaScript eases the learning curve associated when coming to JavaScript from languages like C/C++. Compare these two snippets of code, one in C, and the other in JavaScript, that implement the Fizz-Buzz challenge:
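Here’s a sketch of the pair. First the C version:

    #include <stdio.h>

    int main(void) {
        for (int i = 1; i <= 100; i++) {
            if (i % 15 == 0) printf("FizzBuzz\n");
            else if (i % 3 == 0) printf("Fizz\n");
            else if (i % 5 == 0) printf("Buzz\n");
            else printf("%d\n", i);
        }
        return 0;
    }

And the JavaScript version:

    for (let i = 1; i <= 100; i++) {
      if (i % 15 === 0) console.log('FizzBuzz');
      else if (i % 3 === 0) console.log('Fizz');
      else if (i % 5 === 0) console.log('Buzz');
      else console.log(i);
    }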

If you’re a C programmer, then hopefully the above similarity appeals to you.

Simplicity

Perhaps the number one thing I like about JavaScript is simplicity. With JavaScript you can often write very clean and readable code, because you’re focusing on the algorithm and behavior, rather than adding bloat associated with types and ABIs and memory concerns.

In C++, you’re writing code to appease three different rulers: the type system, the machine, and the reader/maintainer. Most C++ code seems to be a compromise between these, but in JavaScript your code can focus on the reader/maintainer, and let the compiler worry about the machine. You worry more about clearly describing, in code, the behavior of the program, rather than worrying about how the compiler/interpreter will implement that behavior in terms of instructions and memory access, or worrying about constructing elaborate types to describe the meta-rules about what the program is allowed to do.

As a thumb-suck estimate, I probably spend at least half my time in C++ trying to appease the static type checker. And I would say that 90% of that work is spent on false-positives: cases where the type checker notes an inconsistency, but where the same code in a dynamically typed language would not have had a bug in it. Inconsistencies in the type system do not always equate to real program inconsistencies.

At the risk of making this post far too long, I’m going to give you an example, albeit somewhat contrived. Let’s say that we have a binary tree structure: a node in the tree is either a leaf-node with a value, or an internal node with no value but with left and right subnodes/children. Now let’s say that we want a function that flattens the tree, returning an ordered sequence of only the leaf values.

In JavaScript

In JavaScript, I can imagine a function that looks like this:
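A sketch (the node field names here, isLeaf, value, left, and right, are my own):

    function flattenTree(root) {
      const result = [];
      const stack = [root];
      while (stack.length > 0) {
        const node = stack.pop();
        if (node.isLeaf) {
          result.push(node.value);
        } else {
          stack.push(node.right); // push right first, so left is processed first
          stack.push(node.left);
        }
      }
      return result;
    }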

It uses a stack to iterate the tree. I could have made a much more succinct solution using generators and recursion, but I’m appealing to those of you who are coming from strictly imperative, procedural programming backgrounds, so that’s why I chose this approach.

In C

Let’s write the equivalent code in C.

Firstly, a node can be either a single value, or branches into two subnodes. How do we represent this?

Here is one way. A node can be either a leaf value or an internal node, and to tell the difference, we probably need some kind of flag (or tag). It also needs something to store the contents of the node, which is a union between the two different options:
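Jumping slightly ahead of the discussion below, one possible layout (a sketch, with field names chosen to match the fragments quoted later in this post) is:

    #include <stdbool.h>

    typedef struct Node {
        bool isLeafNode;    // the tag: is this a leaf, or an internal node?
        union {
            void* leafContents;          // a leaf's value (discussed below)
            struct {
                struct Node* left;
                struct Node* right;
            } internalNodeContents;      // an internal node's children (discussed below)
        };  // anonymous union: members are accessed directly off the node
    } Node;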

Side note: the style of the above code might not be what you’re familiar with. Or it might be. I don’t know, because C (and C++) doesn’t come with a standard style, which is one of the ways in which I think JavaScript is better: there is a single, generally accepted style for writing JavaScript code.

The contents of an internal node are its left and right sub-nodes, as in the internalNodeContents struct in the sketch above.

Hmm. There’s another question here that’s unanswered. Who owns these nodes? Does the tree own them? Could there be multiple trees that share the same nodes? Is the answer even the same from one tree to the next, or across all the nodes of a tree? It’s not specified here, but it’s yet another “bloaty” detail to figure out.

We said that a leaf node is “a value”. But what value exactly? Is it an integer? Another struct? Is the value type the same for all nodes in a tree? Should the memory for the value be embedded in the node, or should the node point to the value? If it points to the value, then who owns it? If it’s embedded, then how big is it, and are there any rules we have to follow when copying it or moving it to different locations in memory (is it pointed to by anything else, or does it own anything else)? So many questions. So many details that aren’t relevant to the problem at hand.

One way here is just to say that a leaf node has a pointer to the value, and that we don’t know anything further about what type it is: hence the untyped void* in the sketch above.

The anonymous union saves on some of the bloat, but I’d argue that not a whole lot is gained in terms of simplifying real complexity.

How much time have we wasted so far? Let’s recap what we’ve done:

  • We’ve defined one possible implementation of a tree node
  • We’ve coupled the implementation to issues of memory layout, such as whether nodes point to their children and values or have them embedded in the same memory.
  • We’ve opened many cans of worms regarding ownership, style, the type of contents, etc.
  • We haven’t even started writing the actual function yet.

Now for the actual function (oh dear, this is going to be a long post). We need a way to pass the tree to the function, and a way to get the list back. We’ve already defined a type for the tree, but there are other questions to be answered when it comes to passing it to the function:

  • Should the root node be passed by value, or by pointer?
  • Is the function expected to mutate or free the tree passed to it? For example, can the output list reuse the memory from the input tree?
  • Should the tree be passed using a shared global variable, or an actual parameter?1

There are similar questions about getting the output from the function, with all the same concerns that we’ve already discussed about representing trees. Should the result be a contiguous array or a linked list? Should it be allocated by the caller or callee? Should it be a shared global variable? And any number of other considerations.

I’m going to try my hand at a direct implementation in C, trying to sidestep issues of memory allocation by having everything passed to the function itself:
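A sketch of how it might come out (using the Node type from earlier, after fixing the mistakes cataloged below):

    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_DEPTH 16

    // Appends the leaf values of the tree to `list` (no success code returned,
    // and no check that the stack or the output list is large enough).
    void flattenTree(Node* root, void** list) {
        Node* stack[MAX_DEPTH];  // hard limit on how deep the tree can be
        int stackSize = 0;
        stack[stackSize++] = root;
        while (stackSize > 0) {
            Node* node = stack[--stackSize];  // pop
            if (node->isLeafNode) {
                *list++ = node->leafContents;
            } else {
                stack[stackSize++] = node->internalNodeContents.right;
                stack[stackSize++] = node->internalNodeContents.left;
            }
        }
    }

    int main(void) {
        int values[] = { 1, 2 };
        Node tree[3];
        tree[0].isLeafNode = false;
        tree[0].internalNodeContents.left = &tree[1];
        tree[0].internalNodeContents.right = &tree[2];
        tree[1].isLeafNode = true;
        tree[1].leafContents = &values[0];
        tree[2].isLeafNode = true;
        tree[2].leafContents = &values[1];

        void* list[2];
        flattenTree(tree, list);
        for (int i = 0; i < 2; i++) {
            printf("%d\n", *((int*)list[i]));
        }
        return 0;
    }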

This code makes me cringe.

It doesn’t really match the spec, because it has a hard limit on how deep a tree can be. To get around that we would need some dynamic memory allocation, which would add a whole lot more bloat and complexity. This code also requires that the caller have some idea of the maximum size of the resulting list, which may or may not be easy to know.

The complexity is ridiculous. How many times do you need to look at *list++ = node->leafContents before you can be sure that you’re incrementing the pointer into the list of pointers, and not the pointer stored in the list? Maybe we need to add some more bloat to encapsulate these things: more functions for managing the list, so we only have to write that kind of code once. Don’t even get me started!

So let’s see how well our beloved type system did. I’m going to compile the above code and see what errors come up.

Here is the list of mistakes I made. I’m categorizing each as either a true-positive (the compile error saved my skin), a false-positive (with a dynamic type system I would not have had a runtime bug), or a false-negative (a bug I found by inspection that the compiler didn’t catch).

  • I forgot to #include <stdbool.h>… arguably a false-positive, since stdbool is only needed if you have static types.
  • I forgot to include stdio.h. True-positive: I forgot to include a module that was relevant to the program behavior.
  • In creating the example tree, in the line tree[0].internalNodeContents.left = tree[1], I was missing an & sign. I’m going to say this is also a false-positive. I was assigning one value to another, and the fact that one is identified by pointer and the other by value is not a concern related to the algorithm or code logic.
  • To pop a value off the stack, I used stack[stackSize--] instead of stack[--stackSize]. This is a false-negative. The type system did bugger-all to protect me from accessing invalid memory. Luckily the hardware caught it and issued a seg-fault, but on embedded hardware you aren’t always so lucky! What’s more, the code that caused the issue is unrelated to the algorithm that the function was supposed to be implementing. In a sense, it’s the implementation of a completely different algorithm (the algorithm for popping a stack). So the bug was not just unnoticed by the C compiler; it was in a real sense caused by the limitations of the C language.
  • In printf("%d", list[i]), I was logically printing an integer to the console, since the list is a list of integers, but the integers are physically stored as references (pointers), so it should have been printf("%d", *((int*)list[i])). Pretty, ain’t it? This is a false-negative. There was a bug, but the type checker failed to find it. Instead it just printed out crap to the console. On GCC with the default settings2, there was no warning about this.
  • I’m not returning a “success” code, or checking the return code when the function is called. This caused no error in this case, but might cause strange behavior if there was something that did check the result code, or a case where the error code was necessary (a failure). I’d call this a true-negative in this particular case. The function acts unexpectedly, but doesn’t explicitly claim otherwise, so there’s no spec that it’s defying. What’s more, it doesn’t introduce a bug into this particular program.

So how does that compare with JavaScript?

Well, what happened when I ran the JavaScript program? Exactly what I expected to happen. It ran. It output the correct result. No errors. No bugs.

This is not because I’m an expert in JavaScript. I have many more years’ experience in C than JavaScript. It’s because simple, bloat-free code is easy to reason about, and so less likely to contain bugs.

Conclusion: please use JavaScript instead of C. You will have fewer bugs because your code is simpler. It will also cost less to develop because there are fewer concerns to worry about, and it will be easier to maintain because the code is clear and easy to understand.

In C++

I’m not going to implement the above in C++, but instead I’m going to say, in a hand-wavy way, that I don’t think it’s much better. In C++, you could write something that looks similar to the JavaScript version, using a stack from the STL to implement the stack variable. But the problem with this is similar to the problem with C: the implementation is coupled to the machine in a way that means when you bring in your favourite container, you’re forcing the compiler’s hand when it comes to implementing the code in terms of machine instructions and memory. The result is essentially bloat of a different kind. It gets messy, and to make a solution that is as generic as the JavaScript one would require a ton of code, and with it a ton of bugs.

That’s all I’m going to say for the moment. If you come from the land of C++ and want to hear my opinion in more detail, leave a comment or send me an email, and perhaps I’ll make another post about it. This one is well long enough that I should be moving on to my last point.

Safety

The above C example leads me to another great thing about JavaScript: safety. What I mean by safety (or lack thereof) is:

  • How easy is it to introduce bugs?
  • How bad are the bugs, and how difficult are they to fix?

C is awful in this respect. The simple, real bug in the above code, where I dereferenced a pointer that wasn’t assigned, leaves the program open to the most hideous kinds of erroneous behaviors: those that are non-deterministic, and can affect anything in the program. Once you’ve crossed the line of undefined behavior, things in completely unrelated parts of your program can start failing for no apparent reason, no matter how well you wrote them. This is not just a door for bugs, but also for malicious attackers.

In JavaScript, there is no such thing as undefined behavior. There are a few things that are implementation-defined, meaning that different JavaScript engines will execute them differently, but there is nothing like C or C++’s undefined behavior, where a line of code can have literally any effect, including damaging data or even functions in unrelated parts of the program. When you want behavior to be well-defined, use JavaScript instead of C/C++.

JavaScript is also a very good abstraction. Programs execute in isolation from the rest of the system, which is great if you have safety-critical or security-critical applications which need to guarantee some sort of behavior.

Conclusion

I could go on and on about the benefits of JavaScript, and perhaps I will in later episodes, but for the moment I hope that in this extraordinarily long post I’ve convinced you that there is some hope for JavaScript, even to the point of using it in embedded firmware development.


  1. Most people would say using a parameter is preferable, but as I’ve said before: in C you’re appeasing multiple gods. The choice of whether to use a global variable or a parameter is not just about what is easier to reason about or better for code reuse, it’s also about the function ABI and the machine instructions generated. 

  2. and the -std=c99 flag 

JavaScript Corners – Part 1

Recently I’ve been trying to write a simple JavaScript compiler, and it’s led me to think more deeply about some JavaScript behavior that I previously would not have thought about, and I’d like to share that with my JavaScript readers.

Take a look at the following JavaScript code, and try to figure out what it outputs to the console (I’ll give you a hint: it doesn’t output any errors):
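Reconstructed from the discussion that follows, the snippet looked something like this (a sketch; the log messages are stand-ins):

    function test() {
      foo(); // which foo gets called?

      function foo() { console.log('foo 1'); }

      function foo() { console.log('foo 2'); }

      if (false) {
        function foo() { console.log('foo 3'); }
        var bar = function foo() { console.log('foo 4'); };
      }

      var foo = function() { console.log('foo 5'); };
    }
    test();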

The function code first calls foo – but which foo?

Perhaps it executes foo 1. After all, there can only be one function called foo, so perhaps the others fail to bind after the first one is declared, and the first remains the “real one”.

On the other hand, perhaps it’s not the first definition of foo that “wins”, but the last one. So could it be foo 5?

But the last definition of foo is a variable named “foo”, not a function. So which wins when it comes to binding a symbol: variables or functions? Or are they treated equally?

If variables are somehow considered “second prize” to functions when it comes to finding which value matches which name, then it wouldn’t be foo 5 that wins, but rather foo 4, since foo 4 actually has a function named foo, whereas foo 5 is only a variable named foo that holds an anonymous function.

But foo 4 is not declared at the outer scope of the function. Could it be that declarations at the outer scope win against declarations in some kind of nested scope? Perhaps the answer might then be foo 2, since foo 2 is the last function that is actually named foo and declared in the outer block?

The only one we’ve completely ruled out, that it can’t possibly be, is foo 3. The function foo 3 is not the first foo, and not the last. It is not declared in the outer block (and if non-outer-block declarations could win then foo 4 would win). It is also nested inside a block that is only executed on a condition, and the condition is always false so the block is clearly never executed, and thus the function is never declared anyway.

So it can’t be foo 3, but it could be any of the others. Which do you think is it?


You may have guessed it based on my harpings-on. The answer, when I execute the script using node, is indeed foo 3.

In JavaScript, the nested scope is not really a nested scope. All local var and function declarations are at the function scope of the whole parent function. If you’ve been working in JavaScript for a while, you probably already know this. The fact that it’s in a “false” condition block has nothing to do with it, since function declarations are not “statements” to be executed in the sequence of the program (and if they were, then this whole snippet would fail).

What was interesting to me is that, when function and variable names clash, the functions seem to always win.

The foo 4 function is a little misleading. Just because the function has a name, doesn’t mean that it’s attached to the function scope, because in the case of foo 4 the function is a function expression. These are not even part of the function’s namespace, as you can see if you execute the following snippet, and see that it gives you an error:
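For example, a sketch like this:

    function test() {
      var bar = function foo() {};
      foo(); // ReferenceError: foo is not defined
    }
    test();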

Does "use strict" cure this strange behavior, and somehow give a parse failure? On my machine when I change the function to have "use strict", it doesn’t give any errors, but it does change the output to foo 2 instead of foo 3. I found this quite unexpected.

This is all interesting behavior. These things don’t affect everyday JavaScript much, but they are fun and interesting corner cases to consider, and help us understand the language more deeply.

Let me know what you think about this in the comments. Hopefully I’ll continue this series with more of the interesting corner cases that I find along the way.

 

Be a multiplier

You may have heard the analogy that some software engineers add productivity, while some multiply productivity. Today I’d like to dig a little deeper into this and share my own thoughts.

What does it mean?

For those who haven’t heard the phrase before, let me try to unpack my understanding of it. Consider a tale of two programmers – let’s call them Alice and Bob, to pick two arbitrary names. Alice’s boss gives her a task to do: she is told to add a new thingamajig to the whatchamacallit code. She’s diligent, hardworking, and knows the programming language inside out. She’s had many years of experience, and especially knows how to add thingamajigs to whatchamacallits, because she’s done it many times before at this job and her last job. In fact, she was hired in part because of her extensive experience and deep knowledge with whatchamacallits, and at this company alone must have added over a hundred thingamajigs.

Because of her great skill and experience, she gives her boss an accurate time estimate: it’s going to take her one week. She knows this because she did almost exactly the same thing two weeks ago (as she has many times before), so she’s fully aware of the amount of work involved. She knows all the files to change, all the classes to reopen, and all the gotchas to watch out for.

One week later, she’s finished the task exactly on schedule. Satisfied with the new thingamajig, her boss is happy with the value she’s added to the system. Her boss is so grateful for hiring her, because she’s reliable, hard working, and an expert at what she’s doing.

Unfortunately for the company, Alice’s great experience gets her head-hunted by another company, where she’s offered a significantly higher salary and accepts immediately. The company mourns the loss of one of their greatest, who soon gets replaced by the new guy – Bob.

Bob is clearly wrong for the job by all standards, but some quirk of the job market and powers-that-be landed him up taking Alice’s place. He has no prior experience with whatchamacallits, let alone thingamajigs. And he doesn’t really know the programming language either (but he said he knows some-weird-list-processing-language-or-something-I-don’t-remember-what-he-said-exactly, and said that he’d catch on quickly). His new boss is very concerned and resents hiring him, but the choice was out of his hands.

On his first week, his boss asks him to add a thingamajig to the whatchamacallit code, as Alice had done many times. He asks Bob how long it will take, but Bob can’t give a solid answer – because he’s never done it before. It takes Bob an abysmal 2 weeks just to figure out what thingamajigs are exactly, and why the business needs them. He keeps asking questions that seem completely unnecessary, digging into details that are completely irrelevant to the task. Then he goes to his boss and says it will take him 3 weeks to do it properly. “3 weeks! OMG, what have I done? Why did we hire this idiot?”

There’s not much to be done except swallow the bitter pill. “OK. 3 weeks”. It’s far too long. The customers are impatient. But, “oh well, what can you do?”

3 weeks later Bob is not finished. Why? Well again, he’s never done this before. He’s stressed. He’s missing deadlines in his first months on the job, and everyone’s frustrated with him. When all is said and done, and all the bugs are fixed, it takes him 2 months to get this done.

By now there is a backlog of 5 more thingamajigs to add. His boss is ready to fire him, but he optimistically dismisses the 2 months as a “learning curve”, and gives Bob another chance. “Please add these 5 thingamajigs. How long will it take you?”

Bob can’t give a solid answer. He swears it will be quicker, but can’t say how long.

The next day Bob is finished adding the 5 more thingamajigs. It took him 30 minutes to add each one, plus a few hours debugging some unexpected framework issues. What happened? What changed?

What happened is that the first 10 weeks that Bob was spending at his new job, he immediately noticed a big problem. There were 150 thingamajigs in the whatchamacallit codebase, and they all had a very similar pattern. They all changed a common set of files, with common information across each file. The whole process was not only repetitive, but prone to human error because of the amount of manual work required. Bob did the same thing he’s always done: he abstracted out the repetition, producing a new library that allows you just to define the minimal essence of each thingamajig, rather than having to know or remember all the parts that need to be changed manually.

To make things even better, another employee who was also adding thingamajigs, Charlie, can also use the same library and achieves similar results, also taking about 30 minutes to add one thingamajig. So now Charlie can actually handle the full load of thingamajig additions, leaving Bob to move on to other things.

Don’t do it again

The development of the new library took longer than expected, because Bob had never done it before. This is the key: if you’ve done something before, and so you think you have an idea of the work involved in doing it again, this may be a “smell” – a hint that something is wrong. It should light a bulb in your mind: “If I’ve done this before, then maybe I should be abstracting it rather than doing almost the same thing again!”

You could say, in a way, that the best software engineers are the ones that have no idea what they’re doing or how long it will take. If they knew what they were doing, it means they’ve done it before. And if they’ve done it before then they’re almost by definition no longer doing it – because the best software engineers will stop repeating predictable tasks and instead get the machine to repeat it for them1.

Adding and Multiplying

In case you missed the link to adding and multiplying, let’s explore that further. Let’s assign a monetary value to the act of adding a thingamajig. As direct added value to the customer, let’s say the task is worth $10k, to pick a nice round number ($1k of that goes to Alice, and the rest goes to running expenses of the company, such as paying for advertising). Every time Alice completed the task, which took her a week, she added $10k of value. This means that Alice was adding productive value to the company at a rate of $250 per hour.

Now Bob doesn’t primarily add value by doing thingamajigs himself, but instead develops a system that reduces an otherwise 40 hour task to 30 minutes. After that, every time a thingamajig is added, by anyone, $10k of value is added in 30 minutes. Bob has multiplied the productivity of thingamajig-adders by 80 times. In a couple more weeks, Bob would be able to add more value to the company than Alice did during her entire career2.

Is it unrealistic?

The short answer is “no”. Although the numbers are made up, the world is full of productivity multipliers, and you could be one of them. Perhaps most multipliers don’t add 7900% value, but even a 20% value increase is a big difference worth striving for.

The laws of compound interest also apply here. If every week you increase 10 developers’ productivity by just 1%, then after 2 years of compounding you’d be adding the equivalent of about 18 extra developers’ work every day.

The alternative

What happens if Bob was never hired? Would the company crash?

Perhaps, but perhaps not. What might happen instead is that Microsoft, or some big open source community, would do the multiplying for you. They would release some fancy new framework that does thingamajigging even better than the way Bob did it, because they can dedicate many more people to the task of developing the library. The company will take 5 years before they decide to start using the fancy new framework, in part because nobody on the team knew about it, and in part because they now have 250 thingamajigs to migrate and the expected risks are too high for management to accept. But in the end, most companies will catch on to new trends, even if they lag behind and get trodden on by their competitors.

Final notes

In the real world, it’s hard to tell Alice from Bob. They’re probably working on completely different projects, or different parts of the same project, so they often can’t be directly compared.

From the outside it just looks like Bob is unreliable. He doesn’t finish things on time. A significant amount of his work is a failure, because he’s pushing the boundaries of what’s possible. The work that is a success contributes to other people’s success as much as his own, so he doesn’t appear any more productive relative to the team. He also isn’t missed when he leaves the company, because multiplication happens over time. When he leaves, all his previous multiplicative tools and frameworks are still effective, still echoing his past contributions to the company by multiplying other people’s work. Whereas when an adder leaves the company, things stop happening immediately.

Who do you want to be – an adder or a multiplier?


  1. This is not entirely true, since there is indeed some pattern when it comes to abstracting code to the next level, and those who have this mindset will be able to do it better. Tools that should come to mind are those such as the use of generics, template metaprogramming, proper macro programming, reflection, code generators, and domain specific languages 

  2. How much more do you think Bob should be paid for this? 

Async in C

C# has an amazing feature called async, which we’ve talked about many times before on this blog, and which allows a programmer to write functions that are non-blocking, without needing to use threads. What would it look like to have async functionality in C?

I’ve been working on an experimental “mini-programming-language” which does just that. It doesn’t work the same way as in C#, because the needs in C are completely different. In C you don’t want all the hidden overhead costs that exist in C# related to setting up tasks and delegates and execution context. In C, things should be more or less as they seem.

What does it look like?

In this experimental language, you can declare functions as async, to say that they don’t complete immediately in a blocking fashion, but instead may complete asynchronously. When functions are declared async, some interesting things happen. One thing is that, in the context of the async function, the word return actually refers to a function, which can be called and saved like any other function. For example, here is an async function foo which returns the value 1.
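Since the language is experimental, take the syntax here as a sketch of the idea rather than a finished design:

    async int foo() {
        return(1);  // 'return' is itself a function value here; calling it completes foo
    }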

Of course this function actually returns synchronously, even though it’s declared async. The only difference is that it’s returning using continuation passing style instead of a direct return. But using this feature we could actually delay the return to another point in time:
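Something like this, using the Continuation type described just below (onSomeEvent is a hypothetical event-subscription function):

    async int foo() {
        Continuation<int> resume = return;  // save the return continuation for later
        // When the event fires, the event system calls resume(1),
        // which returns the value 1 to foo's caller.
        onSomeEvent(resume);
    }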

Now we’ve saved the return continuation and only triggered it when some event returns. We discussed last time what a continuation might actually look like in C, so this week we’ll just elide the type details and say that a continuation is of type Continuation<T>, where T is the return type of the function calling the continuation. Values of this type are each physically the size of a single pointer, and can be executed using the same syntax as a function.

Now comes the interesting bit. Say we have a function, bar, which calls foo. In this experimental language, you can simply define bar like this:
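A sketch (note: no async keyword on bar):

    int bar() {
        int x = foo();   // foo is async; the compiler infers that bar is too
        return x + 1;    // ...and performs the CPS conversion automatically
    }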

Now clearly bar must be asynchronous as well, since it calls foo, and depends on the result of foo before it can continue. But the magic is that we don’t need to declare the asynchrony explicitly. The experimental language compiler not only infers the asynchrony, but does the corresponding conversion to CPS automatically.

This is more than just a minor convenience. Imagine code like this:
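For instance, a loop over an async operation (fetchValue here is a hypothetical async function):

    int sumValues(int* values, int count) {
        int total = 0;
        for (int i = 0; i < count; i++) {
            total += fetchValue(values[i]);  // fetchValue is async
        }
        return total;
    }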

This is a relatively simple function, but if we had to write the asynchrony out explicitly in C we would have code like the following1:
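A sketch of the expansion (the names are mine; the pattern is the one from the last post, with the callback’s function pointer stored as the first field of the state structure):

    typedef void (*Callback)(void* state, int result);

    typedef struct SumValuesState {
        Callback call;     // must be first: &call doubles as the state pointer
        Callback* caller;  // the continuation to "return" to
        int* values;       // "local variables", hoisted into persistent storage
        int count;
        int i;
        int total;
    } SumValuesState;

    // The async operation, also in continuation style (implementation elsewhere):
    void fetchValue(int value, Callback* callback);

    static void sumValues_step(void* state_, int result);

    void sumValues(SumValuesState* state, int* values, int count, Callback* caller) {
        state->call = sumValues_step;
        state->caller = caller;
        state->values = values;
        state->count = count;
        state->i = 0;
        state->total = 0;
        if (count == 0) {
            (*caller)(caller, 0);  // "return 0"
            return;
        }
        fetchValue(values[0], &state->call);  // first iteration
    }

    static void sumValues_step(void* state_, int result) {
        SumValuesState* state = state_;
        state->total += result;
        state->i++;
        if (state->i < state->count) {
            fetchValue(state->values[state->i], &state->call);  // the loop "jump"
        } else {
            (*state->caller)(state->caller, state->total);      // "return total"
        }
    }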

This is beginning to look like unstructured code. The for-loop construct is now completely hidden, and looks more like the old days of conditional-branch-and-jump. We’re also worrying about things that the compiler really should be sorting out for you. Like passing in the return address, passing in a space for storage of local variables, etc. These are all things you generally don’t have to worry about, so why now?

The experimental language I’m working on handles all of this asynchrony automatically. If the above examples are anything to go by, then certain types of code would be reduced to a quarter of their size, and be orders of magnitude easier to read, if written in this experimental language instead of C. I would perhaps go as far as saying that a fair amount of multithreaded code could instead be written in this async style to run on a single thread, and would as a result be much easier to reason about.


  1. again, using the continuation style defined in my last post 

Continuations in C

There are times when you need to call a function, but you want to say “call me back when you’re done” rather than blocking the current thread.

An example might be when you’re reading a file. If you imagine for a moment that every CPU cycle is on the scale of 1 second, then disk access is in the order of days to months (take a look at the coding horror post about it). When you call a simple C function like fread, you could be blocking the current thread for millions of CPU cycles. You don’t want to be blocking the thread, because threads are a valuable resource and multithreading is a difficult skill.

The Typical Solution

The typical way to solve this in C is to use a callback function. I’m not going to explain callback functions here, since there’s an abundance of information about them on the internet. Instead I would like to point out a convenient pattern of how to store state for the callback function.

Let’s use a concrete example. Say we have some function bar, which is expected to take a long time to execute, and a function foo which needs to call bar. The synchronous way of writing the code (non-callback way) might look like this:
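The synchronous version might be:

    #include <stdio.h>

    int bar(void) {
        // ... some long-running task, such as reading from a file ...
        return 42;
    }

    void foo(void) {
        int result = bar();
        printf("Got: %d\n", result);
    }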

The task finishes by returning some result of the long process. For the purposes of this example, we’ll say that the result is 42.

If we convert it to the asynchronous form (the callback form) it might look like this:
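A sketch of the callback form:

    #include <stdio.h>

    void continue_foo(int result) {
        printf("Got: %d\n", result);
    }

    void bar(void (*callback)(int result)) {
        // ... kick off the long-running task ...
        callback(42);  // called directly here only as a convenience
    }

    void foo(void) {
        bar(continue_foo);
    }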

Note that normally bar would not call the callback itself, but instead save the callback to be called later. I’ve only called it directly from bar as a convenience in the example.

The Problem

I’ve seen this pattern many times. But it’s flawed in a major way: if foo has some state that must be persisted across the call to bar, how does it get that state to the continue_foo function? For example, if I declare a variable in foo, how do I access the variable in continue_foo? Typically what I see is that people will simply use global variables. But that’s an awful solution for many reasons1.

Slightly Better

A better pattern, which I’ve used myself quite often, is for foo to tell bar: “please hold the value of XYZ for me, and when you call me back, please give XYZ back to me, to remind me why I called you in the first place and help me remember where I left off”. It might look like this:
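A sketch (the importantValue field stands in for whatever foo needs to remember):

    #include <stdio.h>

    typedef struct Foo_state {
        int importantValue;  // whatever foo needs to survive across the call
    } Foo_state;

    void continue_foo(void* state_, int result) {
        Foo_state* state = (Foo_state*)state_;
        printf("Got: %d (importantValue: %d)\n", result, state->importantValue);
    }

    void bar(void (*callback)(void* state, int result), void* callbackState) {
        // ... kick off the long-running task ...
        callback(callbackState, 42);
    }

    void foo(Foo_state* state) {
        state->importantValue = 1;
        bar(continue_foo, state);
    }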

A few quick points I’d like to draw your attention to:

  • Bar only sees the type void*, and not something more specific like Foo_state, because obviously bar may be called by other functions as well, not just foo
  • Rather than allocating foo’s state on the heap, foo just accepts the state as a parameter, leaving it up to the caller to decide where it must be allocated. This parameter is only to say where the state should be stored, and is not expected to have any values populated by foo’s caller.

Let me emphasize that last point again: there is no heap allocation involved in this example. The state could very easily be statically allocated, or pool-allocated, or even stack allocated2. In particular, consider that foo’s caller is likely to face the same problems foo has faced with state management, and so might already have its own state structure, which would provide the perfect home for foo’s state structure without incurring an additional heap allocation.

The Best Solution

But we can do even better. The problem with the above example is that we’re passing two things around: the callback function pointer, and the callback state pointer3. Let’s take a look at a way of doing this while only passing one pointer:
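A sketch:

    #include <stdio.h>

    typedef void (*Callback)(void* state, int result);

    typedef struct Foo_state {
        Callback call;       // must be the FIRST field in the structure
        int importantValue;
    } Foo_state;

    void continue_foo(void* state_, int result) {
        Foo_state* state = (Foo_state*)state_;
        printf("Got: %d (importantValue: %d)\n", result, state->importantValue);
    }

    void bar(Callback* callback) {
        // ... kick off the long-running task ...
        int result = 42;
        (*callback)(callback, result);  // "call the callback, and tell it which callback we called"
    }

    void foo(Foo_state* state) {
        state->call = continue_foo;
        state->importantValue = 1;
        bar(&state->call);  // one pointer carries both the function and its state
    }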

I’ll draw your attention to the differences:

  • Foo_state now contains a field called call which holds the callback function pointer. It’s important that this field is the first field in the structure so that a pointer to this field is also a pointer to the whole structure.
  • The callback function signature still accepts the state as a parameter, as before.
  • The call to bar no longer takes two parameters but now only takes a pointer to the callback function pointer (note the double-pointer)
  • When bar needs to call the callback function, it needs to dereference it first. It also needs to pass the callback state. But since, by design, we’ve said that a pointer to the callback function [pointer] is also a pointer to the callback state, we can simply pass that pointer as the argument. This gives us the interesting syntax (*callback)(callback, result). In a sense, this is saying “call the callback, and tell it which callback we called”.

Those who are familiar with how object-orientated programming works under the hood may recognize this pattern. Typically objects are laid out in memory such that the first field in the object state is a pointer to the class vtable. When you call a virtual member function on the object, the pointer-to-the-object is treated as a pointer-to-the-vtable-pointer and is used to resolve the virtual dispatch. In our example above there is actually less complexity and overhead, since we don’t need a whole vtable but can point directly to the function.

Advantages

I love this pattern because it’s really clean and quite simple. The whole callback, including the function and the state, is neatly represented by a single pointer4.

The callback pointer can be called using a very self-contained syntax. That is, it only depends on one variable, not two. This is actually not just a matter of syntax: a single variable means better use of CPU registers, and fewer accesses to memory.

The fact that the callback is represented by one small value also makes it easier to manage. There’s much less risk of calling the callback with the wrong state. It’s also lighter to pass around.

Disadvantages

The most obvious disadvantage to me is that it’s uncommon. Someone looking at the code for the first time won’t just understand what’s happening straight off the bat. It also means that there’s no language support for it. C++ is in some ways an extension to C with language support for first-class objects. But there is no common language that is an extension to C with support for this kind of first-class function (with state).

The performance of using this pattern isn’t a disadvantage in my opinion. If you’re comparing it to the performance of a “naked” function pointer, then yes, you may incur some overhead from passing the additional state argument and from double-dereferencing the function pointer. But consider that this type of function call should actually be faster than calling a virtual function in most object-orientated languages (which involves a triple-dereference), because there’s no vtable lookup. And virtual function calls are in turn typically faster than interface function calls (and than virtual functions under multiple inheritance, depending on the optimizer and conditions).

I’d also like to dispel another disadvantage, not directly related to the pattern but more about using callbacks in general. At first glance it seems that there is a lot of overhead in accessing persistent variables in the state structure, because instead of saying “x” you have to say “state->x”, which implies an extra pointer dereference and possibly some pointer arithmetic. But think about this: how are variables normally accessed anyway? Variables are normally stored in the stack frame, which is essentially a structure pointed to by the stack pointer. Yes, there may be less promotion of variables into registers, which would affect performance, but I think it may be less of a problem than you’d expect.

Likewise, at first glance it seems that there is extra space used to store the callback function pointer. But in reality, a stack frame also stores the “callback” function pointer anyway: we just normally refer to it as the “return address”. An important point to note in the last example is that the very last thing foo does is call bar. This is what’s called a tail call, and it means that any half-decent optimizer will re-use foo‘s stack frame space for bar. To put it another way: while bar is active, foo doesn’t use any stack space, but it does use space in the persistent state structure (wherever that may be), and the persistent state structure has many of the same attributes as the stack frame would have had, including a pointer into code space. From this perspective, there is no extra space required to store the callback address in the state structure.

The only thing missing is hardware support. A “normal” call has hardware support for automatically populating the return address into the state structure (aka stack frame) and saving register states etc (aka saving persistent variables). And a “normal” return has built-in support for dereferencing the stack pointer to obtain the return address pointer (note the double-pointer again) and jumping to that address, all in one step. But I imagine that if this pattern became more common in usage (probably with language support), hardware support would probably follow.

Until then, I still think it’s a great pattern to use in C, and we should all add it to our toolbox of C patterns.


  1. Please ask me – I’ll be happy to tell you all the reasons why it’s so horrible 

  2. In the less likely scenario that the caller decided to manually block the thread using thread synchronization techniques. 

  3. On most modern architectures this would just mean that it takes twice the space, since there are two pointers involved. But C doesn’t require function pointers to be the same size as data pointers. One embedded architecture I work with has function pointers that are twice the size of normal heap pointers – after all, RAM is more expensive per bit than ROM 

  4. A RAM pointer, which in some cases is smaller than a function pointer, giving it yet another advantage over the typical callback