Category: Thoughtlets

Incidental vs Accidental Complexity

Developers often talk about accidental vs essential software complexity. But is there a difference between accidental complexity and incidental complexity? I think there is.

I like this definition of essential complexity:

Essential complexity is how hard something is to do, regardless of how experienced you are, what tools you use or what new and flashy architecture pattern you used to solve the problem.

(source)

That same source describes accidental complexity as any complexity that isn’t essential complexity, and this may indeed be the way that the terms are typically used. But I’d like to propose a distinction between accidental and incidental complexity:

Accidental complexity: non-essential complexity you introduce into the design by mistake (by accident), and you can realize your mistake in an “oops” moment (and consider correcting or learning from it).

Incidental complexity: any non-essential complexity you introduce into the design, which may be by mistake or on purpose (it may or may not be accidental and may or may not be accompanied by an “oops” moment).

Accidental complexity is like the time I accidentally knocked a drinking glass off the table and it shattered into pieces, making my morning more complicated as I cleaned it up. It’s marked with “oops” and the realization of having caused something I didn’t mean to.

Incidental complexity is like the time I chose to turn off the hotplate on my stove by pressing the digital “down” button 8 times until the level reached zero; my wife later informed me that pressing “up” once would achieve the same thing (since the level wraps around under certain circumstances). I did not accidentally press the “down” button 8 times, I did it intentionally, but I was still ignorant of my inefficiency.

Clearly, there’s a fine line between the two. Defining something as incidental complexity but not accidental is a function of my ignorance and depends on my level of awareness, which changes over time and varies from person to person.

Why introduce unnecessary complexity on purpose?

1. Simplicity takes effort

Paradoxically, often the simpler solution is also one that takes longer to design and implement, so it can be a strategic design choice to go with a more complicated solution than a simpler one.

This is especially true for POCs and MVPs in unfamiliar domains or where the business requirements aren’t completely clear. In these cases, you can’t be confident of all the details upfront and you’re quite likely to be rewriting the solution a few times before it reaches its final state. Spending time simplifying and organizing each iteration can be a wasted effort if it will just be thrown away.

There may also just not be time for the simpler solution. The added effort of defining a clean abstraction and structure takes time, and there is an opportunity cost to taking longer on something, maybe through losing business revenue when the feature is delivered later, or through the cost of not working on the next feature. There are tradeoffs to be made.

2. Ignorance: You don’t know any better

This might be a little controversial, but I would argue that if you design something as simple as you can possibly conceive, then any remaining incidental complexity is not “accidental”. Like the example earlier of me pressing “down” 8 times instead of “up” once.

As a concrete example in software, JavaScript didn’t always have async/await or the Promise type, and as a result, code was often structured with nested callbacks to handle sequences of asynchronous actions (known as “callback hell”). This kind of code is clearly unnecessarily complicated, no matter how hard people tried to contain the complexity. But designs based on callbacks are not accidentally complicated — they have exactly the design intended by the author, who believes it to be the best possible design.

And yet, promises could be (and eventually were) implemented as a third-party library, and so could the async/await pattern (through the abuse of generator functions). So the potential was there from the beginning for every developer to structure their code in a simpler way, if only they could realize the potential to do so. The complexity was not imposed as a restriction of the programming language; it was imposed by ourselves, because we hadn’t been awakened to a better way of seeing things.
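
To make the contrast concrete, here’s a small sketch of the same two-step sequence written first with nested callbacks and then with async/await; the fetch functions are hypothetical, declared only so the example type-checks:

// Hypothetical async operations, declared (not implemented) for illustration
type User = { id: string; name: string };
type Order = { id: string; total: number };
declare function fetchUser(id: string, cb: (err: Error | null, user?: User) => void): void;
declare function fetchOrders(userId: string, cb: (err: Error | null, orders?: Order[]) => void): void;
declare function fetchUserP(id: string): Promise<User>;
declare function fetchOrdersP(userId: string): Promise<Order[]>;

// Callback style: each step nests inside the previous one ("callback hell")
function getOrderTotals(userId: string, done: (err: Error | null, totals?: number[]) => void) {
  fetchUser(userId, (err, user) => {
    if (err || !user) return done(err ?? new Error('no user'));
    fetchOrders(user.id, (err, orders) => {
      if (err || !orders) return done(err ?? new Error('no orders'));
      done(null, orders.map(o => o.total));
    });
  });
}

// Promise/async-await style: the same sequence reads top to bottom
async function getOrderTotalsAsync(userId: string): Promise<number[]> {
  const user = await fetchUserP(userId);
  const orders = await fetchOrdersP(user.id);
  return orders.map(o => o.total);
}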

It’s worth noting one more thing: if you’ve ever tried to explain JavaScript Promises to someone, you’ll know that an explanation of what a Promise is or does is unconvincing. Understanding how to think in terms of Promises requires a mental paradigm shift, which doesn’t occur merely by explanation. Even a concrete demonstration may be unconvincing since the code may be “shorter”, but is it really “simpler” when it’s full of all these new and unfamiliar concepts? The way to internalize a new paradigm like this is to use it and practice thinking in terms of it. Only then can we really see what we were missing all along.

Why does it matter?

I think the distinction between accidental and incidental complexity matters because it should keep us humble and aware that the way we write software today is probably ridiculously inefficient compared to the techniques, patterns, and paradigms of the future.

It shows that certain kinds of incidental complexity can look a lot like essential complexity if we don’t yet have the mental tools to see those inefficiencies. The biggest productivity boosts we could gain in our designs might be those that aren’t obvious from the perspective of our current ways of thinking, and which might take a lot of time and effort to see clearly.

It shows that finding incidental complexity is not just a case of looking at a project and identifying what we think is needless, since there’s a good chance that we’re blind to a lot of it, and all we will see are things that make us say “yes, that thing is important for reason xyz — we can’t remove that”. Like all the callbacks in callback hell: every callback seems important, so it seems like it is all essential complexity, even though there exists a completely different way of structuring the code that would make it easier to understand and maintain.

But on the positive side, it shows that we are capable of identifying a lot of incidental complexity, if we spend enough time and effort thinking through things carefully, and if we remain open and curious about better ways of doing things.

Distributed apps in JavaScript

This post is different from my usual topic of Microvium. I’ve been recently frustrated with the way that distributed applications are written and I’ve been brainstorming ways it could be improved. In my last post on the topic, I suggested that maybe a new programming language would be useful for solving the problem, but I ended the post thinking about how the Microvium snapshotting paradigm might also be a suitable solution. This time, I’m going to consider a solution that doesn’t require Microvium.

A quick summary of the objectives

I want the experience of developing a distributed application to be basically as seamless as that of developing a single-process desktop application, such as the ability to get type checking between services, to debug-step directly from one service to another during development, to write regression tests that encompass multiple services, etc. And I want this all with minimal boilerplate.

My last post on the topic goes into more detail about the issues I have with the current state of affairs and the improvements that could be made.

A proposed solution

Here’s the summary: I think the Microvium snapshotting paradigm is indeed the answer, and I think we can in a sense “polyfill” the snapshotting ability in a limited context by deterministically replaying IO.

I’ll talk about this solution in a few parts:

  1. What is snapshotting and how could we use it to solve the problem?
  2. How can we implement snapshotting on a modern JavaScript engine?
  3. Is it necessary to do this in JavaScript?
  4. More details

What is snapshotting and how does it help?

I’m using the term “snapshotting” here to mean the ability to “hibernate” a process to a file and then resume it later on the same or a different machine. This means capturing to a file the full state of the process, such as the global variables, the stack and heap state, and machine registers. And the ability to restore the full state in a completely new context.

Normally applications are deployed as either a compiled executable or as source code depending on whether the programming language is compiled or interpreted. Having the snapshotting capability gives us a third option: run the application code at build-time and deploy a snapshot of its final initialized state.

Doing it the latter way allows application code to have pre-runtime effects, such as defining infrastructure, and for pre-runtime code to pass information seamlessly to runtime code, such as secrets and connection strings for the aforementioned infrastructure.

When I talk about coding in JavaScript, I’m really talking about coding in TypeScript, and so another benefit you get with this paradigm is type checking across different infrastructural components and type checking between infrastructure and runtime code. There are other solutions like Pulumi that allow you to write IaC code in the same language as your application code, but to my knowledge, there are none that allow you to pass [typechecked] state directly from IaC code to runtime code.

How to implement snapshotting?

Microvium is a JavaScript engine that implements snapshotting at the engine level, but Microvium doesn’t yet implement much of the JavaScript spec and its virtual machines are size-limited to 64kB. It’s in no way practical for writing a distributed cloud application.

Node on the other hand is used for cloud software all the time, and I’m a huge fan of it, but it doesn’t support snapshotting in the way described.

But I think we can emulate the snapshotting feature on Node by having a membrane around the application that records all IO while it runs at build-time, and silently replays the IO history as an initialization step at runtime to recreate the final, initialized state. Although this will be slower than just recovering a direct dump of the memory state (if the engine had supported that), in theory, it should work.

This will only be reliable if JavaScript is deterministic (meaning that its state and behavior follow deterministically from its incoming IO), which it almost is already. Work by Agoric on SES and Compartments makes this even more so, by allowing the creation of a sandboxed environment for JavaScript modules, in which non-deterministic JavaScript library functions (e.g. Math.random and new Date) can be removed by default or replaced with deterministic implementations. They do this because they run JavaScript on the blockchain, where a collection of redundant nodes needs to process the same script and reach the exact same conclusion about the new state of the blockchain — a strong determinism requirement.
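
As a rough illustration of this record/replay idea (a sketch with made-up names, not Microvium or SES code), a non-deterministic call can be wrapped so that its results are recorded at build time and replayed at runtime:

// Minimal sketch: wrap a non-deterministic call so its results are recorded
// at build time and replayed at runtime, making re-initialization deterministic.
type IOLog = unknown[];

function recordReplay<T>(mode: 'record' | 'replay', log: IOLog, fn: () => T): () => T {
  let cursor = 0;
  return () => {
    if (mode === 'record') {
      const value = fn();        // perform the real call at build time
      log.push(value);           // ...and remember the result
      return value;
    }
    return log[cursor++] as T;   // at runtime, replay the recorded result
  };
}

// Build time: real random numbers, captured in `log`.
// Runtime: the same numbers come back out of `log` in the same order.
const log: IOLog = [];
const deterministicRandom = recordReplay('record', log, Math.random);
deterministicRandom();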

Does it need to be JavaScript?

Honestly, I can’t think of a way to do this outside of JavaScript. JavaScript has a number of key features that I think are really useful for this solution:

  • JavaScript is already almost completely deterministic. There are only a handful of non-deterministic operations like Math.random, and JavaScript’s flexibility allows us to easily remove these or replace them with deterministic proxies. Contrast this with C#, where it’s impossible to prevent some block of code from performing non-deterministic IO or having non-deterministic behavior by using threads.
  • JavaScript allows us to create perfect proxies and membranes. You can create proxies for classes and not just instances of the class, and you can create proxies for modules. In C#, there is no way to put a membrane around an assembly, namespace, or class.
  • Related to creating perfect membranes, JavaScript allows you to hook into all IO, including the importing of dependent modules. This would allow the framework to trace the dependency tree of each individual service and create the appropriate bundles for each.

I think it would be possible to get some approximations of the idea in other programming languages, but it certainly appears to me that JavaScript is particularly well suited.

More Details

Most of the above has been quite abstract. I think it would be worth explaining the vision in a little more concrete detail.

I imagine there to be a thing, which I might call a “service”, which is the minimal distribution unit (we’ll say a service is “a thing that can be deployed”). A service might be a single microservice/lambda, or a database, or a whole distributed application made up of nested services.

A typical SaaS company using this solution would probably just have one root “service” that represents their whole distributed application, and that service would be composed of subservices1, which could each have their own subservices, etc.

A service is a JavaScript module (file), which runs at build time on the build machine (or dev machine), and then with snapshotting is moved to the runtime machine(s) when it’s finished its initialization (when all the top-level code has run, including its transitive function calls, etc).

While executing at build time, the service code is then able to configure its own runtime “hardware” that it will eventually be moved to. For example, it might call a host function to declare “I’d like to run as a lambda with automatic scaling”, etc. The host can record this information and provision the necessary resources before moving the service to its own desired runtime environment.

A special case of this general rule is that a service can choose a runtime manifestation that doesn’t support code execution at all. For example, a database, queue, or pub-sub system.

A service can instantiate other services at build time. Concretely, this might look something like:

var myService = importService('./my-service-code.js');

I expect this to be roughly analogous to just importing another module into the current service, similar to using node’s require() function. It will import and execute the given JS module (in the same machine process), and return an object representing the script’s exports. But with the distinction that the new module is loaded inside a membrane, and the return value is therefore a proxy for the actual exports inside the membrane.

The membrane serves a few different purposes:

  • When the running services are “moved” to their runtime environments, services in different membranes will be on different physical runtime hardware. Service code may at build time configure its future runtime hardware.
  • At build time, the membrane records all IO exchanges such that it can deterministically replay the IO at runtime during initialization, to restore the service to its exact “snapshotted” state, as mentioned earlier. It can verify and silently absorb outgoing IO, and deterministically replay incoming IO.
  • At build time, services are allowed to pass around references to other services, as a kind of “dependency injection” phase. At runtime, the membranes around each service need to handle the marshalling of calls between services as they use these injected dependencies.

The importService function can construct a new Compartment which allows it to virtualize the environment in which the service code runs. This can have a variety of effects (sketched in code after the list):

  • We can provide a replayable variation of non-deterministic builtins, such as Math.random and new Date. These can be treated as a special case of IO. They can return real dates and random numbers at build time, as long as they return the same dates and random numbers when replayed at runtime.
  • We can provide service-specific APIs. For example, a function akin to pleaseLetMeRunAsALambda() could be injected into the globals for the service, or we could expose additional builtin-modules to the service code.
  • We can intercept module imports so that we can bundle the service with its module dependencies into a single package for deployment, without the overhead of bringing in dependencies used by other services.
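
Here’s a rough sketch of how importService might tie these pieces together. The names (loadInSandbox, ServiceHandle, the endowments) are made up for illustration, and the real mechanism would presumably lean on SES Compartments rather than the placeholder used here:

// A rough sketch with made-up names. `loadInSandbox` stands in for the real
// sandboxing mechanism (e.g. an SES Compartment); it is an assumption here,
// not a real API, and the membrane details are heavily simplified.
declare function loadInSandbox(
  modulePath: string,
  endowments: Record<string, unknown>
): object;

interface ServiceHandle {
  exports: object;     // proxy for the module's exports inside the membrane
  ioLog: unknown[];    // IO recorded at build time, replayed at runtime
}

function importService(modulePath: string): ServiceHandle {
  const ioLog: unknown[] = [];

  // Globals injected into the service's virtualized environment:
  const endowments = {
    // A replayable stand-in that the sandbox could expose as Math.random
    random: () => { const v = Math.random(); ioLog.push(v); return v; },
    // A service-specific API for declaring runtime infrastructure
    pleaseLetMeRunAsALambda: () => { ioLog.push({ infra: 'lambda' }); },
  };

  // Load and execute the module inside the membrane, intercepting its imports
  // so the service can later be bundled and deployed on its own.
  const exports = loadInSandbox(modulePath, endowments);

  // The caller gets a proxy; at runtime, the membranes marshal calls between
  // services that were given references to each other at build time.
  return { exports: new Proxy(exports, {}), ioLog };
}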

Pulumi

I haven’t done much research on this yet, but Pulumi seems to provide an API for defining infrastructure in JavaScript code. I speculate that the Pulumi API could be brought in as a low-level build-time API for services to define things like “I want to be a lambda when I grow up”.

Conclusion

I’m fairly confident that something like this would work, and I think it would make for much cleaner code for distributed applications. I also don’t think it would take that long to implement, given that many of the pieces are already available off the shelf. But even so, I probably shouldn’t get too distracted from Microvium.


  1. This kind of infinitely recursive hierarchical organization of the architecture is something I’ve seen missing from AWS and Azure 

Quality vs Quantity

It’s an age-old debate, but as I get older, my perspective is changing.

I feel increasingly frustrated by tools that don’t work smoothly; tools that weren’t thought out very well. And I ask myself, wouldn’t my life be better without these?

TL;DR The economic incentive for product development is not always aligned with what makes the world a better place. Tools that make people more productive do not necessarily make them happier. As creators, we should first-and-foremost create things that are pleasant to use, even when it’s not economically optimal for us or the companies we work for — economics should serve humankind, not be our master.


I’d be remiss to write such a post without whining about some concrete examples:

My Washing Machine

My washing machine1 has a “quick wash” setting that can do a load of washing in about 30min — great for those in a rush. And it allows you to further customize the quick wash with things like temperature and spin speed.

But for some unknown reason, it doesn’t allow you to select the maximum spin speed while in the “quick wash” setting!

If you want your clothes well-spun, you need to select another setting, the quickest of which is “whites”, which takes three and a half hours!

There’s a work-around, which I use most times — you run the machine on a quick-wash cycle (30 min), and then afterward change the setting to “whites” and then select the sub-setting “spin only”.

There are many other confusing and frustrating choices made for the interface and control of this machine, but it would take a whole post to go into them all!

How many extra man-hours would it have taken that company to properly think through the HMI2 design of the washing machine? How much extra would that have added to the price of each machine?

And here’s a deeper question: does the existence of this washing machine design make the world a better or worse place? If we could magically remove all washing machines with this design from history, would we be better or worse off?

What does it even mean to be better or worse off? This depends heavily on your world view. If you are a hedonist, then happiness/pleasure is the ultimate goal, and you can mentally compare the global happiness with or without this washing machine design in it, and make a value judgement about whether the washing machine is a net win or loss.

The difference between the two hypothetical worlds is subtle and complicated. The annoyance I feel when I do the washing is brief and small, but it’s multiplied by the thousands of people who have a similarly-designed machine.

And if we magically removed the design from the world timeline, we’d also be removing the satisfaction I felt when I discovered the work-around, and the catharsis of complaining about it in this blog post, or the social bonding it provides between people with the same issue who can enthusiastically affirm each other’s miserable experiences of such a damned machine.

But even with all these “benefits” of bad design, I honestly believe that we’d all be better off, in at least a small way, if this design didn’t exist.

So, what does this mean? Should humankind band together to create the perfect washing machine — a design which can be used for hundreds of years into the future, making people all over the world happy or even excited to do their washing?

In principle, I’d say “yes”. If we as humankind had collectively spent our efforts creating just one “great” washing machine design instead of a mix of good and bad designs across many companies, we’d probably all be better off overall.

The reality, of course, is more complicated. Different people’s ideas of the “perfect washing machine” are different. Technology makes this a moving target — the best washing machine 1000 years from now will be much better than the best we can do now, but we don’t expect a washing machine company to internally do a thousand years of technological research before creating their first product!

Adobe After Effects

Adobe After Effects is software for creating and editing video and animation. I learnt how to use After Effects this year so I could create little animated diagrams like that in my snapshot-vs-bundler post.

I find After Effects to be a really interesting example when it comes to armchair-philosophy about product quality. After Effects is, in my opinion, incredibly poorly thought out and frustrating to use. But it is also the only software that can do everything that it does, and using it makes you 100x more productive than if you were to try to do the same animations/effects manually.

I won’t go into a lot of specifics of what I think is bad design in After Effects, but I can’t help but mention one example: you can’t copy-paste something from one project to another (in fact, you can’t even open two projects at the same time). If you designed something great in one project and want to use it in the next (reuse, DRY), it’s a complicated and time-consuming process to export parts of one project into a temporary project and then import the temporary project as a nested child of the second project, from which you can now copy and paste.

You’re welcome to disagree with me and think that it’s a great design. The thought-experiment remains valid: in cases where a tool achieves something that was previously unachievable, or boosts their user’s productivity by some factor, is it better for the tool to have more capability or to be more pleasant to use?

The answer is not completely obvious, and it really got me thinking. My mind is in the weird paradox of simultaneously thinking that my life is worse off with After Effects, while I also choose to use it because nothing else is available to do the job.

I wonder how many other people might also be in a similar position, perhaps frustrated by tools they use, but pushed to use them because the tools make their competitors more productive. It’s a prisoner’s dilemma situation (wiki).

50 years ago, nobody was complaining about the lack of After Effects software — they were fine with its absence — but today, we complain about its flaws. We’re “forced” to use it, either because our competitors in the space are using it or, in my case, because the mere knowledge of its existence makes work without it feel much slower, which itself is frustrating.

I honestly think the world would be a better place if After Effects didn’t exist and if knowledge of its absence also didn’t exist, nor a craving for something like it to exist. Or even better, the world would be a better place if After Effects was created with more design thought, making common workflows easy and pleasant, even at the expense of its variety of features.

Lessons

As a software engineer, I’m one of the lucky few who are actually contributing to the development and design of these kinds of tools. So what kind of lessons can I learn from some of my frustrations with other tools, like the washing machine or After Effects?

For me personally, the lesson is that quality is far more important than quantity. I think that humankind will be better off with fewer tools and less productivity, in exchange for tools that are more pleasant to work with. It’s better if technological development is slower, maybe 20 years behind where it would otherwise be, but the technologies developed are robust, dependable, and a pleasure to use.

In some ways, this goes against modern business practice. In our capitalistic world, there’s a temporary advantage gained for using a tool that makes the business more productive, even if it makes its employees less happy — the benefit is negated when the competitors do the same thing, removing the competitive advantage of the productivity boost and driving down overall prices of the product (but leaving employees still unhappy with the status quo).

Likewise, on the other side, it’s advantageous for a software company to sell tools that deliver the maximum economic value to their users, not necessarily maximum pleasure and peace. Remember that automating the repetitive parts of a user’s job does not necessarily make their life better, it just changes the nature of their work (now their job is about running the automation software). If a new piece of software allows users to complete their work in half the time, they don’t take the other half of their time to sit in the park and watch the birds play in the fountain!

If you create a tool that delivers more productivity in a way that’s more frustrating to use than the less-productive way, the machine of capitalism forces people to use your more-productive tool (which may make you rich in the process!) at the expense of increasing net frustration in the world.

Microvium

(Non-technical audiences may want to skip this section)

Microvium has parallels with After Effects, in that it’s a new tool that achieves something that was previously unachievable (it brings JavaScript to devices with only 16kB of flash). It boosts productivity significantly over alternatives such as EmbedVM.

So would it be ok if Microvium brought these benefits but was frustrating to use?

I think not. I’ve spent considerable effort in making Microvium as easy and pleasant to use as possible. For example:

  • I take great care in the full experience of integrating and using Microvium.
    • The one-liner install for node hosts, or the two-file copy-and-paste for C hosts (one h file for the interface, and one c file for the library).
    • The getting-started guide demonstrates this to the extreme, leading the user incrementally through all the concepts, building up to a full native host example.
    • The getting-started guide is even part of the regression tests — the automated test infrastructure literally extracts the example code from the getting-started markdown file and makes sure it all still works. There’s little more frustrating than an outdated or incorrect introduction to a new tool.
  • I spent a great deal of effort making the FFI3 design super simple.
    • I hope that the way I designed it seems “obvious” and “intuitive” to people looking at it in hindsight — this is the way that designs should be. When you read through the getting-started guide, you probably don’t think at all about how good or bad the FFI system is. It’s hopefully invisible to the user, in the same way that a good washing machine should be invisible. It should just do its job, in the least-frustrating way possible.
    • Compare the FFI design of Microvium with that of Node.js for example, where one requires complicated binding files, additional configuration files (e.g. binding.gyp), the need to understand N-API vs NAN vs the V8 API, etc. In Microvium, you can export a JavaScript function to C code using a single line of JavaScript code and no configuration file is needed:
      vmExport(1234, sayHello);
  • Even for something like the module API for Microvium (for node.js hosts), which many people won’t encounter directly, I put a great deal of thought into making the model as conceptually simple as possible.
    • Compare to something like the Compartment API for JavaScript (under development), where the explanation of its module API necessitates the definition of 6 different kinds of “module specifier”, and the distinction between “resolving”, “loading”, “binding”, and “importing”. While Microvium does all of these, it doesn’t require the user to understand these concepts in order to use it.

This is to say that I’m practicing what I preach. However much I may fall short of the ideal, I put a lot of time and energy into making Microvium a pleasure to use and simple to understand, at the expense of not delivering as many features and it taking longer to develop.

I believe this was the right choice, and I’ll definitely apply it to other projects moving forward.

What are you optimizing for?

I’ll conclude by asking the question: what are you optimizing for?

If you are a manager or leader at a company, you may be inclined to optimize the company to produce profits. You may even have a fiduciary duty to do so on behalf of your investors.

Sometimes the profit incentive is aligned with making good products: after all, happy customers are returning customers.

But clearly that’s not always the case — it’s common practice to release buggy software early, so you can iterate on it and use your users as a testbed to weed out the bugs. It’s common practice to “fail fast” in small startups — getting new ideas out the door quickly so you can establish their efficacy in the market, rather than spending significant time designing something that might be pleasant to use but that nobody wants because the idea was flawed. It’s common practice to aim for a first-mover advantage, especially in markets with a network effect, to trap people in your ecosystem before competitors (who may have a more pleasant product but were slower to get there) have a chance to penetrate the market.

Should fiduciary duty or duty as an employee transcend the betterment of mankind?

What goals are you optimizing your products for? What are you optimizing your company for? What are you optimizing your life for?


  1. Actually, my housemate’s washing machine 

  2. Human-machine interface 

  3. Foreign Function Interface — the way in which script code communicates with host code and vice versa

A Distributed Language

TL;DR I think it would be great to have a language where a single instance of a program can run non-stop for 10 years across a dynamic, heterogeneous mix of hardware, where the program code can perform its own allocation and management of infrastructure resources such as databases, VMs, and public IP addresses, as well as allowing maintainers to safely hot-swap new code into the running program to update it.


This is a deviation from my normal topic — Microvium — to talk about a random idea1.

I recently got a new job at Flip Logistics. Like many businesses an engineer might get a job at, they offer a SaaS solution: a cloud-hosted application with a web front-end, some native apps for various devices, some public APIs for integration with 3rd parties, and backend storage and logic. So far, this aspect is the same as every company I’ve worked at, and I’m sure the same as many others.

Being a newcomer is a great time to see things with fresh eyes. In this post, I’ll go over some of the issues I see with the construction of SaaS software in general (none of this is specific to Flip in any way, and I don’t speak on behalf of Flip in this post), and propose a direction that may resolve a lot of the complexity.

The End Goal

I think the best way to express the issues I see with SaaS2 implementation is to contrast it to the implementation of a single native application. I believe that much of the implementation inefficiency/complexity of a cloud application comes from the fact that it’s distributed: it exists beyond a single OS process. A typical programming language only has first-class support for pieces running within a single process.

Self-Contained

A native application is self-contained: it starts, it allocates all the resources it needs, including all the interfaces it uses to communicate with the outside environment (e.g. the user).

Contrast this with distributed applications where resources are imposed externally: virtual machines, databases, IP addresses, certificates, firewall rules, etc.

Imagine if simple applications worked this way: to start MS Word, you first need to go to your OS Dashboard where you instantiate a (blank) GUI window for MS Word to eventually use; then you instantiate a file for it to work on; and reserve the amount of RAM you want it to run in, the processor core, etc. Then you run your configuration manager to associate the MS Word “process” with the RAM, the file, the GUI window, etc. Then finally you boot the whole thing up and now you can edit your Word document.

Conversely, imagine a cloud “application” that you could just run (instantiate) by the equivalent of just “double-clicking” on it, and letting it allocate its own resources: VMs, network configuration, firewall rules, load balancing, etc.

Documentation, APIs, and Type Safety

In most application software I’ve written, little or no technical documentation needs to exist outside the source code itself. I’m a huge fan of self-documenting code, where the name and type signature of a function or class describes all you really care to know about what it does3.

The information in interface signatures is not only useful for humans, it’s useful because the type system can ensure that implementations and clients of the interface are in agreement about the contract that must be met. This is safe.

But this only applies to interfaces within the context of a single application. As soon as you enter the realm of distributed applications, you now have the issue of defining contracts between the individual distributed components. Much of the time, these are not type-checked, so producers and consumers have no guarantee that contracts are met.

These kinds of inter-service contracts are normally documented externally to the code, where it’s easy for them to get out of date. And hitting “navigate to definition” in your favorite IDE doesn’t take you from the client code to the server code where you can actually see the implementation of the API you might be invoking.

Here’s an example to illustrate the lack of type checking:
In C#, I can bring a queue into existence by new Queue<T>(). Or I could use a 3rd party queue library with similar syntax. The type checker enforces the contract between the producer and consumer sides of the queue. Contrast this with an AWS or Azure distributed application, where instantiating a queue service is done outside your program code and there is no type checking to verify the contract between producers and consumers of the queue, or to self-document the kind of things that consumers should expect to get out of the queue.
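
To make the contrast concrete in the same TypeScript-esque pseudocode used later in this post, here’s a sketch; sendMessage is a made-up stand-in for a cloud queue client, not a real SDK call:

// In-process: the element type is part of the queue's contract,
// so the compiler checks both the producer and the consumer.
const localQueue: Array<{ orderId: string; total: number }> = [];
localQueue.push({ orderId: 'A1', total: 42 });    // OK: checked at compile time
// localQueue.push({ oderId: 'A1', total: 42 });  // typo: compile-time error

// Cloud queue (sendMessage is hypothetical, for illustration only): the
// contract lives in external documentation, and the payload is just a string
// by the time it crosses the service boundary, so the same typo goes unnoticed.
declare function sendMessage(queueUrl: string, body: string): Promise<void>;
sendMessage('https://queue.example.com/orders', JSON.stringify({ oderId: 'A1', total: 42 }));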

This isn’t a straightforward problem to solve. A good type system for distributed applications would probably need to account for tearing between the client and server, where the client and server are on different versions — so contracts defined by type signatures would need to span into the past and future.

Debugging

I’m not sure that any solution in the near future can solve this problem, but the debugging experience for a distributed application is severely lacking.

One issue is that you cannot step across service boundaries. While debugging a client of an API, I can step into other functions within that client, but I cannot step from the client to the server code and see both the client and server in the same “call stack”.

Prior Solutions

I’m sure many people have tried to solve these kinds of problems before, to varying success. Two existing solutions off the top of my head:

ASP.NET Web Forms. Within the limited problem-space of a client web application and its backend server, Web Forms attempts to bridge the gap between client and server, having both in the same project and a fair amount of type safety between them. This only works for certain types of applications, and is far from a general solution to the issue.

Infrastructure as Code (IaC), in various shapes — this allows code to specify the so-called “infrastructure”, such as what VMs to create and how to route network traffic to them. This gets part of the way there, but it falls short of the ideal.

To go back to the MS Word analogy, IaC seems akin to having a script that you must run prior to running MS Word, so that the script can set up the “infrastructure” required for MS Word (reserving RAM, creating the file, instantiating the GUI, etc). This still seems unnatural, and it doesn’t give you type checking between the “infrastructure” and the other code.

My Solution

Starting at the end and working backwards, I want a solution where allocating a queue service is as easy as allocating a queue data structure. Something like:

// Instantiate a queue data structure in memory
let localQueue = new Queue<int>();
// Instantiate a queue service, which may in turn instantiate the VM it needs, etc
let infrastructureQueue = new QueueService<int>();

(I’m using a TypeScript-esque language here for my examples)

Similarly for a database:

let db = new DynamoDB<MySchema>({ ...options });

I don’t mean this to create a connection to a database, I mean that these actually create a database. The returned db variable would therefore be a local object representing the remote resource.

Application Lifetime

I also believe that the code that creates the database should be part of the application code. Just like in a simple native application, if a resource is needed by the application, the code at startup should acquire that resource (or it can be acquired on-demand when it’s later needed).

MS Word is instantiated when you double-click the icon (and this is when it acquires many of its resources, such as creating the GUI window), and its lifetime might be minutes, hours, or maybe days, before you shut it down and it releases all its resources.

The lifetime of a distributed cloud service is much longer — you might start it up in production one day, and then retire it 10 or 20 years later. If it allocates a database at startup, you would expect that database to be around for the full 20 years.

This should drive home a paradigm shift I’m trying to communicate: a distributed application, as I’m using the term, is not the thing we currently call a “service” or “microservice”. Rather, the code for the “distributed application” is something that might run for a decade, or a century.

An example is “the Google search engine”, which from the outside is just an application available at google.com which has been running continuously since 1998.

Today’s languages and application environments are simply not suited for this paradigm. We’re used to application code being immutable for the duration of the application: it needs to be right before you run the application. If you need to update the application code, you need to stop the application, releasing all its resources, in order to deploy the update.

When we move into the domain of distributed applications, and these resources include things like databases, message queues, load balancers, etc., we can’t afford to have the application tear them down when we “close” the application to update its code (nor can we afford to close the application to update its code).

But we may be heading towards a world where it’s possible to mutate a running application. An example that comes to mind with JavaScript is React-JS Hot Reloader (I think this video shows it in action). It is intended as a development tool: changes to your code files are detected when you hit “save”, and the changes are injected into the running application without stopping it. In particular, the in-memory state of your components and store are preserved.

I’m not suggesting that Hot Reloader is the solution here, but just that it shows that the paradigm is unfolding. It’s not unreasonable to think of a future where a single distributed program (single code project with a single entry point) can run for years on a cloud of physical machines and other hardware, allocating and deallocating resources according to the rules it defines in its code — such as making a new database for each client, and spinning up VMs according to its dynamic load4 — and allowing us to maintain it while it continues to run by hot-swapping code into it.

I think that likely the ultimate solution here involves a new language that is designed to handle this kind of thing. A language in which the capability of hot-swapping pieces of code is assumed from the start. A language with type-safety across a non-atomic release (patching different physical machines at different times). A language which can describe processes that run on unreliable hardware and with unreliable communication.

But possibly a lot of progress can also be made with existing languages. Maybe in future I’ll go into more detail about what that might look like.

Microvium

I know that this post isn’t about Microvium, but I want to tie this to Microvium as well, because there is certainly some overlap. I won’t dwell on it because Microvium was not designed to solve this problem.

The key feature of Microvium that has relevance here is the snapshotting feature. It plays with the relationship between code and physical hardware in a novel way:

  • A Microvium process (a running instance of a Microvium program) doesn’t run on just one machine. It is in some sense a “distributed” application because it runs first in one environment and then is suspended and moved to a different environment. In a typical use, it will first run on a server or development machine, and then resume execution on a client or microcontroller.
  • A Microvium process, and by extension the resources it allocates (such as memory structures), can transcend the physical hardware on which it runs. For example, you can kill power to the machine, releasing all physical RAM, but yet the Microvium app can still have live objects in that powerless state. The program cannot make forward progress without some CPU, but because of the snapshotting feature, the machine (CPU and RAM) that continues execution is not necessarily the same that starts it — the program can jump between machines while keeping its live state.
  • A Microvium program can control its own compilation5, because compilation == snapshotting. I previously wrote about how snapshotting can supersede bundling and be superior to it. But looking at this more abstractly, we can think of this as internalizing an externally-imposed workflow, which is similar in principle to my proposal to bring IaC scripts into the program that runs on that infrastructure: giving a program its own control over the infrastructure and perhaps even the build process.

While Microvium may touch on some of the concepts required for a solution, just like Hot Reloading, it’s clearly not the complete picture.


  1. Although I actually do talk about Microvium at the bottom 

  2. I’m going to use the terms “SaaS application”, “cloud application” and “distributed application” interchangeably in this post, on the assumption that most large SaaS system are complicated because they run over distributed hardware 

  3. Or at least I strive toward this end goal 

  4. All by rules defined _in the program code_, or delegated to your favorite third party library, not necessarily baked into the platform 

  5. I’m using the term “compilation” here to mean compiling the source code to bytecode 

Progress on MetalScript and random thoughts

This has been a good week for MetalScript.

This post is more of an informal ramble. Read it if you have extra time on your hands, but I don’t think I’m going to say anything profound here. Also read this if you’re doing a product competing with MetalScript, since I’m spilling some of the implementation here ;-)

A few weeks or months ago1, I switched over from MiniLanguage to MetalScript. MiniLanguage is a new language I started about 9 months ago to test-drive a number of the MetalScript principles in a simplified context. It’s in MiniLanguage that I’ve created a working end-to-end pipeline from source text to output binary. But rather than perfect MiniLanguage with all the bells and whistles, I wanted to move back to the real project.

Breakthrough — bridging compile time and runtime

I had a breakthrough this week in terms of structuring code that implements the spec in a way that splits between runtime and compile time. In particular, the ParseScript operation (and some of the surrounding code) has been a bit of a pain in the butt — the behavior of the function according to the spec is to return a Script Record, which contains both a reference to the realm2, and also a reference to the script code.

The script code is purely a compile-time construct, while the realm is purely a runtime construct. So ParseScript is kinda split between runtime and compile time. It’s really awkward to consolidate code that is split in this way, while still trying to make it maintainable. And I’ve been bashing my head against a wall for a while on the right way to do it.

Something finally clicked this week, and I found a way to have both runtime behavior and compile time behavior defined in the same lines of code. The way it works is essentially that my implementation of ParseScript returns a monad that contains both the compile time component of the return value and the sequence of runtime IL operations required to get the runtime component of the return value. The caller can immediately use the compile time component, and then is obliged to also emit code that invokes the runtime component.

For simplicity, I opted to implement the monad as just a tuple, as described by the MixedPhaseResult type below.

/** Used as the result of an operation that has both a compile time and runtime
 * component. The compile time component of the operation is the function
 * itself, and its result is the first element in the returned tuple. The
 * runtime component is represented as an IL function that is the second part of
 * the tuple. */
type MixedPhaseResult<T> = [T, il.ILFunction];

...

// https://tc39.github.io/ecma262/#sec-parse-script
function parseScript(unit: il.Unit, sourceText: string): MixedPhaseResult<ParseScriptResult> {
  ...
}

...
  const [parseResult, parseScriptRT] = parseScript(unit, sourceText);
  const scriptRecord = code.op('il_call', parseScriptRT, realm, hostDefined);

(Side note: Yes, of course MetalScript is being written in TypeScript, and the plan is to make it self-hosting so I can compile MetalScript with MetalScript in order to distribute an efficient binary executable of the compiler).

The beauty of this approach is that it allows me to have a single parseScript function in my implementation, which as you can see above has an embedded comment that references the exact location in the spec, and the full behavior of the corresponding piece of the spec is fully encapsulated in the body of parseScript. This one-to-one relationship between spec functions and implementation functions is going to be super useful from a maintenance perspective — keeping up to date with the latest spec — which as stated in a previous post matches one of my goals with MetalScript.

I’ve used this technique in a number of other places that bridge the gap between compile time and runtime, and I think the result is beautiful.

MetalScript Unit Tests

I’ve also started writing unit tests for MetalScript. Previously I avoided unit tests because there was too much uncertainty and I landed up completely changing my mind on things too often to make unit tests a useful addition. Now after spending 9 months in MiniLanguage, I feel I’m reaching a point of stability in the underlying concepts and have started adding unit tests.

The first unit tests I have working use the above parseScript and related functions to translate a source text to IL. And as of today I have this working for 2 simple test cases — an empty script and a “Hello, World” script.

I can’t show you the IL itself, because it would give away too many of the internals. Maybe when I’m further along in the project and there is less risk of having my ideas stolen, I will give up more details. But believe me when I say the IL is a thing of beauty.

The hello-world script translates to 62 lines of IL (including some whitespace), which is a lot, and emphasizes how many operations are actually required to perform simple tasks in JavaScript, and how much of an accomplishment it is to get to this point. Bear in mind that this IL language is designed by me with the intention of compiling easily, not to be a compact representation of the program, since the IL will never get to the target device.

Personal Note: Be comfortable with your work

A personal lesson I’m learning with this and other projects, is to do what it takes to feel comfortable with what you’ve done. In MetalScript, it’s a constant battle in my mind as to whether I should cut corners to save time and get to a POC quickly, or whether I should take it slow and make sure that every piece is as simple, understandable, reliable, and maintainable as possible.

There are arguments for both at different stages of a project, but if you plan on the project becoming something big, then I really believe you need to do what it takes to feel emotionally comfortable with what you’ve done. The reason is that when you leave a piece of code, and in the back of your mind you think of it as hacky, fragile, or overly complicated, and when you wrote it you just had to pray that it worked, then you aren’t going to want to go back to it, and you will become generally demotivated by your work. But if you leave a project or piece of code feeling comfortable about it, then it will be much easier to go back “home” to it in future.

So when I say that you should spend time on your work until you feel comfortable with it, I’m not talking about spending time making the most advanced piece of code you can be proud of, that has a gazillion features and can do backflips and handstands and handle a bunch of different use cases. I’m talking about thinking really carefully about how to remove complexity from your code and distill it down to its bare essence. You want to use your superpowers to remove complexity, not to handle it. Understanding a complicated design is only the first step; reducing it to a simple design is the end goal.

If your code is clean, simple, and has a good readme and guiding comments to help newcomers get into it, then you will feel more comfortable when you are the newcomer getting back into it after some time.

 


  1. Time is a blur 

  2. The realm is a collection of builtin objects such as Array and Object 

Coding on Paper in an Interview

My company is in the process of hiring a back end or full stack software engineer. During the interview process, I often get complaints from candidates when I bring out an old fashioned pad of paper and pencil and ask them to write some code for me (or I offer for them to code on the whiteboard).

This is 2017, and we have syntax highlighters, auto-complete, Stack Overflow, etc. Why would I want to test a candidate’s ability to code on paper?? This seems barbaric and absurd. Like asking a racing car driver to ride around a track on a bicycle to see how well he handles the corners. Surely we’re past this antiquated form of testing in this day and age?

Well, no. Let me share my side of the story.

I need people who can code

The position is for a software engineer. I need to know that you can code1. It’s quite simple.

We can have a good chat about Azure, about how you parted the red sea, or how you helped get a man on the moon. But in the end, you’re here to code, so you need to show me that you can.

Hiring a coder without seeing them code, is like hiring a chef without tasting their food.

The number one best way to test your real-world coding ability would be to have you work with me for 3 months, but that’s not a practical way to screen people for a new position.

Another way for me to establish your level of skill would be for me to look at work you’ve done in the past. AKA your public contributions to GitHub, Stack Overflow, blogs, etc. I am perhaps one of the few managers who will actually clone your GitHub repos and look through your code, your commit history, your documentation, your blog posts2, your answers on forums, etc.

A quick side note on this: remember that I’m also a software engineer, like you. I understand perfectly well that there are times when you need to throw a heap of “poor” code together, just to get something done quickly, or because it’s just an experimental throw-away project that you were working on in your spare time. I won’t judge you on this. The signature of good style and craftsmanship will still shine through, even in your darkest code, so I still read it.

But, I can’t just look at past work you’ve already done. I also need to look inside your head. I need to see those cogwheels spinning when you’re working on a problem. I don’t just want to see the final outcome, I want to know what thoughts went through your mind on the journey to get there. The best way to do this is for us to work on a piece of code together, in an interview. Here you can explain to me what you’re thinking, and ask questions3.

The code itself is the easy part

In the past, I used to ask two coding questions in an interview. I started with an easy one, and then would progress to the harder question. But I became tired of the emotional distress of watching people struggle through the harder question, and often eventually give up.

So, now I just ask the easy question4.

In fact, here it is. I’ll give it to you.

Write a function that adds up a collection of numbers.

That's it.

That’s it?

Yes. That’s it.

In fact, I’ll give you the5 answer too6:

int Sum(int[] numbers)
{
    int counter = 0;
    foreach (int number in numbers)
        counter += number;
    return counter;
}

Here’s another one:

public static double MySum(this IEnumerable<double> numbers)
{
    return numbers.Aggregate(0.0, (acc, x) => acc + x);
}

Side note: One person has pointed out that they would just use the Sum function that .net provides. Well done. Ten brownie points for you7. But then I will follow up by saying something like “Somebody on the .net development team had to write that builtin Sum function. Pretend it was you. How would you write it?”

There are obviously many different ways to write some code that satisfies the initial requirements statement of this test question. This is true in the real world as well — feature requests come in many different forms, to varying degrees of specification, and you need to be able to dig in and understand the problem that the user is trying to solve, and explore different solutions and the effectiveness of each.

Be resourceful. Dig in.

I don’t care whether or not you remember the syntax for generics or extension methods in C#. I don’t care if you remember what the C# equivalent to a fold/reduce function is. These are questions that Google or Stack Overflow can answer in the real world.

I don’t care about your spelling. This is something a spell checker would check in the real world.

I want to see how you engage a problem. When you’re actually working on the job, you’re going to have all sorts of weird and wonderful requests thrown your way. How do you deal with them? Do you blindly start pecking at the keyboard, or do you dig into what’s actually being asked?

If you need some tool to help you, do you say “this company is so stupid, they don’t even provide xyz!”, or do you say “how do we get xyz?”.

The same is true in an interview. Do what you need to do to get the job done. If you need to understand the context more, then ask me. If you need auto-complete and an IDE, or Google, ask for a laptop (actually I typically offer one).

At the very least, I expect you to ask me what kind of numbers we’re adding together.

If you need help remembering the way something works, or the syntax for something, ask me. Let’s work through it together. I want to know that you can work effectively with me and others on the team. I want to know that you feel comfortable asking for advice, that you’re not the kind of person to suffer in silence.

If you think that the whole thing is stupid, let me know. I want to know that the people on my team can challenge me and can stand up to being challenged themselves.

It’s about why

It’s not about what your answer looks like. It’s about why you made certain choices. Were there performance concerns you were worried about? What usage scenarios were you planning for — and did you ask? Is one language construct better than another from a maintainability standpoint? What approaches did you consider, and how did you weigh up each?

If you’re [un]lucky enough to interview with me, keep in mind that the main thing I’m looking for is your reasoning and approach when encountering a problem. Talk me through your thinking.

It’s a toy

Yes, this coding exercise is a toy example. And yes, I'll make stuff up as I go. If you ask whether the numbers need to be integers or floats, I'll make the answer up as I go, and it might be different every time I do an interview.

But put yourself in my shoes for a moment. In an interview setting, we only have time to play with a toy example. We don’t have time for me to explain a real problem and have you craft a real solution for me. I need to know how you code, and a toy example is the only tool in the box that can get the job done effectively in a short span of time.

Hopefully by now you understand that it really won’t help you to use a laptop in my interview. The things I’m looking for are things where a laptop won’t help you. Sorry. But do whatever makes you feel comfortable.


  1. …and that you can communicate effectively, and work well with my team, and have the relevant experience. But all of this is pointless if you can’t code. 

  2. I don’t care what the blog posts are about actually — I will evaluate you based on passion, ability to communicate clearly, etc. Bonus points if your blog is about coding, and I can get a two-for-one evaluation on communication and coding. 

  3. Like maybe “WTF are we doing this for?” 

  4. [Edit:] Since writing this post, I’ve changed my mind about the second, “hard” question. I now ask the hard question as well, depending on how much time is available. I might also ask a “time-boxed” version of the hard question — saying something like “do as much as you can of this in 10 minutes”. Part of the reason is that I want to give you the opportunity to present your skills on structurally different questions. 

  5. *an 

  6. Since the position is for an Azure developer, the most common language of choice for this question is C# 

  7. Even if you didn’t know that it existed, I’d give you points for saying “I’m sure this function already exists in some library somewhere. I would probably first work out if I can use that, rather than reinventing the wheel” 

Be a multiplier

Be a multiplier

You may have heard the analogy that some software engineers add productivity, while some multiply productivity. Today I’d like to dig a little deeper into this and share my own thoughts.

What does it mean?

For those who haven’t heard the phrase before, let me try to unpack my understanding of it. Consider a tale of two programmers – let’s call them Alice and Bob, to pick two arbitrary names. Alice’s boss gives her a task to do: she is told to add a new thingamajig to the whatchamacallit code. She’s diligent, hardworking, and knows the programming language inside out. She’s had many years of experience, and especially knows how to add thingamajigs to whatchamacallits, because she’s done it many times before at this job and her last job. In fact, she was hired in part because of her extensive experience and deep knowledge with whatchamacallits, and at this company alone must have added over a hundred thingamajigs.

Because of her great skill and experience, she gives her boss an accurate time estimate: it's going to take her one week. She knows this because she did almost exactly the same thing two weeks ago (as many times before), so she's fully aware of the amount of work involved. She knows all the files to change, all the classes to reopen, and all the gotchas to watch out for.

One week later, she’s finished the task exactly on schedule. Satisfied with the new thingamajig, her boss is happy with the value she’s added to the system. Her boss is so grateful for hiring her, because she’s reliable, hard working, and an expert at what she’s doing.

Unfortunately for the company, Alice’s great experience gets her head-hunted by another company, where she’s offered a significantly higher salary and accepts immediately. The company mourns the loss of one of their greatest, who soon gets replaced by the new guy – Bob.

Bob is clearly wrong for the job by all standards, but some quirk of the job market and the powers-that-be landed him in Alice's place. He has no prior experience with whatchamacallits, let alone thingamajigs. And he doesn't really know the programming language either (but he said he knows some-weird-list-processing-language-or-something-I-don't-remember-what-he-said-exactly, and said that he'd catch on quickly). His new boss is very concerned and resents the hire, but the choice was out of his hands.

In his first week, his boss asks him to add a thingamajig to the whatchamacallit code, as Alice had done many times. He asks Bob how long it will take, but Bob can't give a solid answer – because he's never done it before. It takes Bob an abysmal 2 weeks just to figure out what thingamajigs are exactly, and why the business needs them. He keeps asking questions that seem completely unnecessary, digging into details that are completely irrelevant to the task. Then he goes to his boss and says it will take him 3 weeks to do it properly. "3 weeks! OMG, what have I done? Why did we hire this idiot?"

There’s not much to be done except swallow the bitter pill. “OK. 3 weeks”. It’s far too long. The customers are impatient. But, “oh well, what can you do?”

3 weeks later Bob is not finished. Why? Well again, he’s never done this before. He’s stressed. He’s missing deadlines in his first months on the job, and everyone’s frustrated with him. When all is said and done, and all the bugs are fixed, it takes him 2 months to get this done.

By now there is a backlog of 5 more thingamajigs to add. His boss is ready to fire him, but optimistically dismisses the 2 months as a "learning curve" and gives Bob another chance. "Please add these 5 thingamajigs. How long will it take you?"

Bob can’t give a solid answer. He swears it will be quicker, but can’t say how long.

The next day Bob is finished adding the 5 more thingamajigs. It took him 30 minutes to add each one, plus a few hours debugging some unexpected framework issues. What happened? What changed?

What happened is that during those first 10 weeks at his new job, Bob noticed a big problem. There were 150 thingamajigs in the whatchamacallit codebase, and they all followed a very similar pattern. They all changed a common set of files, with common information repeated across each file. The whole process was not only repetitive, but prone to human error because of the amount of manual work required. Bob did the same thing he's always done: he abstracted out the repetition, producing a new library that lets you define just the minimal essence of each thingamajig, rather than having to know or remember all the parts that need to be changed manually.
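
To make the shape of that concrete (this is entirely invented – "thingamajig" is a placeholder, and none of these types exist anywhere): the difference is roughly between hand-editing a common set of files for every new thingamajig, and declaring one small definition that a shared library expands into all of that boilerplate.

// Hypothetical sketch of the kind of library Bob might have built: callers declare only
// the minimal essence of a thingamajig, and the library generates the repetitive wiring
// that used to be copied by hand across many files.
public class ThingamajigDefinition
{
    public string Name { get; set; }
    public string Whatchamacallit { get; set; }
}

public static class ThingamajigRegistry
{
    public static void Register(ThingamajigDefinition definition)
    {
        // ...emit or update the boilerplate that was previously edited manually...
    }
}

// Adding a new thingamajig becomes a few lines instead of a week of manual edits:
ThingamajigRegistry.Register(new ThingamajigDefinition
{
    Name = "Frobnicator",
    Whatchamacallit = "Billing"
});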

To make things even better, another employee who was also adding thingamajigs, Charlie, can also use the same library and achieves similar results, also taking about 30 minutes to add one thingamajig. So now Charlie can actually handle the full load of thingamajig additions, leaving Bob to move on to other things.

Don’t do it again

The development of the new library took longer than expected, because Bob had never done it before. This is the key: if you've done something before, and so you think you have an idea of the work involved in doing it again, this may be a "smell" – a hint that something is wrong. It should light a bulb in your mind: "If I've done this before, then maybe I should be abstracting it rather than doing almost the same thing again!"

You could say, in a way, that the best software engineers are the ones that have no idea what they’re doing or how long it will take. If they knew what they were doing, it means they’ve done it before. And if they’ve done it before then they’re almost by definition no longer doing it – because the best software engineers will stop repeating predictable tasks and instead get the machine to repeat it for them1.

Adding and Multiplying

In case you missed the link to adding and multiplying, let’s explore that further. Let’s assign a monetary value to the act of adding a thingamajig. As direct added value to the customer, let’s say the task is worth $10k, to pick a nice round number ($1k of that goes to Alice, and the rest goes to running expenses of the company, such as paying for advertising). Every time Alice completed the task, which took her a week, she added $10k of value. This means that Alice was adding productive value to the company at a rate of $250 per hour.

Now Bob doesn’t primarily add value by doing thingamajigs himself, but instead develops a system that reduces an otherwise 40 hour task to 30 minutes. After that, every time a thingamajig is added, by anyone, $10k of value is added in 30 minutes. Bob has multiplied the productivity of thingamajig-adders by 80 times. In a couple more weeks, Bob would be able to add more value to the company than Alice did during her entire career2.

Is it unrealistic?

The short answer is “no”. Although the numbers are made up, the world is full of productivity multipliers, and you could be one of them. Perhaps most multipliers don’t add 7900% value, but even a 20% value increase is a big difference worth striving for.

The laws of compound interest also apply here. If every week you increase 10 developers' productivity by just 1%, the gains compound: after a year the team is producing the equivalent of almost 7 extra developers' work (1.01^52 ≈ 1.68), and after two years more than 18.

The alternative

What happens if Bob was never hired? Would the company crash?

Perhaps, but perhaps not. What might happen is that Microsoft, or some big open source community, would do the multiplying for you. They would release some fancy new framework that does thingamajigging even better than the way Bob did it, because they dedicate many more people to the task of developing the library. It will take the company 5 years to start using the fancy new framework, in part because nobody on the team knew about it, and in part because they now have 250 thingamajigs to migrate and the expected risks are too high for management to accept. But in the end, most companies will catch on to new trends, even if they lag behind and get trodden on by their competitors.

Final notes

In the real world, it’s hard to tell Alice from Bob. They’re probably working on completely different projects, or different parts of the same project, so they often can’t be directly compared.

From the outside it just looks like Bob is unreliable. He doesn’t finish things on time. A significant amount of his work is a failure, because he’s pushing the boundaries on the edge of what’s possible. The work that is a success contributes to other people’s success as much as his own, so he doesn’t appear any more productive relative to the team. He also isn’t missed when he leaves the company, because multiplication happens over time. When he leaves, all his previous multiplicative tools and frameworks are still effective, still echoing his past contributions to the company by multiplying other people’s work. Whereas when an adder leaves the company, things stop happening immediately.

Who do you want to be – an adder or a multiplier?


  1. This is not entirely true, since there is indeed some pattern when it comes to abstracting code to the next level, and those who have this mindset will be able to do it better. Tools that should come to mind include generics, template metaprogramming, proper macro programming, reflection, code generators, and domain specific languages 

  2. How much more do you think Bob should be paid for this? 

Needy Code

Needy Code

Today I was reading about an implementation of a MultiDictionary in C# – a dictionary that can have more than one value with the same key, perhaps something like the C++ `std::multimap`. It all looks pretty straightforward: a key can have multiple values, so when you index into the `MultiDictionary` you get an `ICollection<T>` which represents all the values corresponding to the key you specified. This got me wondering: why did the author choose to use an `ICollection<T>` instead of an `IEnumerable<T>`? This wasn't a question specific to `MultiDictionary`, but more of a philosophical musing about the interfaces between different parts of a program.

What is IEnumerable/ICollection?

In case you're not familiar with these types, the essence is that `IEnumerable<T>` represents a read-only sequence of items (of type `T`), which can be sequentially iterated using `foreach`. Through the use of LINQ extension methods we can interact with `IEnumerable<T>` at a high level of abstraction. For example, we can map an `IEnumerable<T>` `x` using the syntax `x.Select(…)`, and there are methods for many other higher-order functions. Another great thing about `IEnumerable<T>` is that its type parameter `T` is covariant. This means that if a function is asking for an `IEnumerable<Animal>`, you can pass it an `IEnumerable<Cat>` – if it needs a box of animals, a box of cats will do just fine.

An `ICollection<T>` is an `IEnumerable<T>` with some extra functionality for mutation – adding and removing items from the collection. This added mutability comes at the cost of covariance: the type parameter of `ICollection<T>` is invariant. If a function needs an `ICollection<Animal>`, I can’t give it an `ICollection<Cat>` – if it needs a storage for animals, I can’t give it a cat storage because it might need to store an elephant.
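
To put the animal analogy in code, here's a tiny sketch (the `Animal`/`Cat` types and method names are just stand-ins I've made up for illustration):

using System.Collections.Generic;

class Animal { }
class Cat : Animal { }

static class VarianceDemo
{
    // Read-only consumer: it only iterates, so any sequence of Animals (or of a subtype) will do.
    static void FeedAll(IEnumerable<Animal> animals)
    {
        foreach (var animal in animals) { /* feed it */ }
    }

    // Mutating consumer: it might Add a new Animal, so it genuinely needs a container of Animals.
    static void Shelter(ICollection<Animal> animals)
    {
        animals.Add(new Animal());
    }

    static void Main()
    {
        var cats = new List<Cat> { new Cat() };

        FeedAll(cats);    // OK: IEnumerable<T> is covariant, so an IEnumerable<Cat> is an IEnumerable<Animal>
        // Shelter(cats); // Compile error: ICollection<T> is invariant – otherwise Shelter could Add an elephant to a list of cats
    }
}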

What to choose?

`IEnumerable<T>` is a more general type than `ICollection<T>` – more objects can be described by `IEnumerable<T>` than by `ICollection<T>`. The philosophical question is: should you use a more general type or a more specific type when defining the interface to a piece of code?

So, if you’re writing a function that produces a collection of elements, is it better to use `IEnumerable` or `ICollection`1 (assuming that both interfaces validly describe the objects you intend to describe)? If you produce an `IEnumerable` then you have fewer requirements to fulfill, but are free to add more functionality that isn’t covered by `IEnumerable`. That is, even if the code implements all the requirements of `ICollection`, you may still choose to deliberately only expose `IEnumerable` as a public interface to your code. You are committing to less, but free to do more than you commit to. This seems like the better choice: you’re not tied down because you’ve made fewer commitments. In the future if you need to change something, you have more freedom to make more changes without using a different interface.

If you are consuming a collection of elements, then the opposite is true. If you commit to only needing an `IEnumerable` then you can’t decide later you now need more – like deciding you need to mutate it. You can demand more and use less of what you demanded, but you can’t demand less and use more than you asked for. From this perspective, even if your code only relies on the implementation of `IEnumerable` it seems better to expose the need for `ICollection` in its interface because it doesn’t tie it down as much.
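
In signature form, the asymmetry looks something like this (the `Order` type and the method names are mine, purely to illustrate the two roles):

using System.Collections.Generic;

public class Order { public int Id; }

public class OrderQueue
{
    private readonly List<Order> pending = new List<Order>();

    // Producing: commit to as little as possible. The implementation happens to be a List<Order>,
    // but only IEnumerable<Order> is promised, which leaves room to change the internals later.
    public IEnumerable<Order> GetPendingOrders()
    {
        return pending;
    }

    // Consuming: demand as much as you might plausibly need. Asking for ICollection<Order>
    // keeps mutation (Clear) on the table, at the price of being harder to satisfy.
    public void ArchiveOrders(ICollection<Order> orders)
    {
        // ...write them to the archive, then empty the collection...
        orders.Clear();
    }
}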

This strikes me as very similar to producers and consumers in the real world. The cost of producing a product benefits from committing to as little as possible, while the "cost" of consuming a product benefits from demanding as much as possible. But there is also another factor: a consumer who demands too much may not be able to find the product they're looking for. That is, more general products can be used by more people. In software terms: a function or class with very specific requirements can only be used in places where those very specific requirements are met – it's less reusable.

Is it Subjective?

Perhaps this all comes down to a matter of taste or circumstance. Code that’s very “needy”2 is less reusable, but easier to refactor without breaking the interface boundaries. Code that’s very “generous”3 is more reusable but more restricted and brittle in the face of change.

I personally prefer to write code that is very generous. That is, given the same block of code, I write my interfaces to the code such that I consume the most general possible types but produce the most specific possible types. In many ways this seals the code – it’s very difficult to change it without breaking the interface. But on the other hand, it makes different parts more compatible and reusable. If I need to make a change, I either change the interface (and face the consequences), or I simply write a new function with a different interface.

I also think that it may be better to write very generous interfaces to code that is unlikely to change and will often be reused, such as in libraries and frameworks. Domain-specific code which is likely to change across different versions of the program may benefit from being more needy.

MultiDictionary

Although this whole discussion is not specifically about the `MultiDictionary`, the idea was seeded by my encounter with `MultiDictionary`. So let's spend a moment considering how I might have implemented `MultiDictionary` myself.

Firstly, most of this discussion so far has been centered on how you would define the interface, or function signature, of a piece of code that either produces or consumes generic types. I haven't really gone into what that code should actually look like, but rather what its interface should look like given that the code already has intrinsic demands and guarantees. If we apply this to the `MultiDictionary`, then it probably has its most generous signature already. The code already implements `ICollection`, so it's generous to expose it in its interface. The dictionary itself doesn't really consume any interfaces beyond its basic insert and delete functions, so we can't talk about its generosity as a consumer4.

On the other hand, if I had written the code myself I possibly would have only exposed `IEnumerable`. Not because I write needy interfaces5, but because it just seems like a better representation of the underlying principle of a multi-dictionary. As the code stands, there are two different ways of adding a value to the dictionary: one by calling the `add` method on the dictionary, and one by calling the `add` method of the collection associated with a key. This duplication strikes me as a flaw in the abstraction: the `MultiDictionary` is probably implemented using a `Dictionary` and some kind of collection type, such as a `List` (I imagine `Dictionary<TKey, List<TValue>>`). It could just as easily, although perhaps not as efficiently, have been implemented as a `List` of key-value pairs (`List<KeyValuePair<TKey,TValue>>`). Would the author still have exposed `ICollection<TValue>`? Probably not. This strikes me as a leaky abstraction in some ways, and I would probably have avoided it. In my mind, exposing `IEnumerable<TValue>` is a better choice in this particular case. The neediness would actually simplify the interface, and make it easier to reason about.

Perhaps this is about not saying more than you mean. You define upfront what you mean by a `MultiDictionary`, and then you describe exactly that in the interface, before implementing it in code. If you mean it to be a set of collections, then `ICollection` is great. If you mean it to be a `Dictionary` which happens to support duplicate keys, then `ICollection` seems to say too much.
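
For what it's worth, here's a minimal sketch of the `IEnumerable`-flavoured version I have in mind (simplified, and not the actual implementation being discussed):

using System.Collections.Generic;
using System.Linq;

// A minimal sketch of a multi-dictionary whose indexer only promises IEnumerable<TValue>,
// so that the dictionary's own Add is the single, unambiguous way to insert a value.
public class SimpleMultiDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, List<TValue>> storage = new Dictionary<TKey, List<TValue>>();

    public void Add(TKey key, TValue value)
    {
        List<TValue> values;
        if (!storage.TryGetValue(key, out values))
        {
            values = new List<TValue>();
            storage[key] = values;
        }
        values.Add(value);
    }

    // The indexer only promises IEnumerable<TValue>, keeping the public surface to one Add path.
    public IEnumerable<TValue> this[TKey key]
    {
        get
        {
            List<TValue> values;
            return storage.TryGetValue(key, out values)
                ? (IEnumerable<TValue>)values
                : Enumerable.Empty<TValue>();
        }
    }
}

With only `IEnumerable<TValue>` exposed, there is exactly one way to add a value – through the dictionary itself – which is the property I'd want the abstraction to guarantee.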



  1. For the remainder of the post I will use the term `IEnumerable` to mean `IEnumerable<T>`, and the same for `ICollection` 

  2. “Needy” = produces very general types but consumes very specific types 

  3. “Generous” = produces very specific types but consumes very general types 

  4. Actually, I could argue that since `ICollection` is mutable and accepts data in both directions, the exposing of `ICollection` makes it a consumer in a way that `IEnumerable` wouldn’t, but I won’t go into that 

  5. Perhaps a better word here would be stingy 

The speed of C

The speed of C

Plenty of people say C is fast [1][2]. I think C is faster than other languages in the same way that a shopping cart is safer than a Ferrari.

Yes, a shopping cart is safer.

… but it’s because it makes it so hard to do anything dangerous in it, not because it’s inherently safe.

I wouldn't feel comfortable in my shopping cart rattling around a Ferrari race track at 150 mph, because I'd be acutely aware of how fast the tarmac is blazing past my feet and how inadequate my tools are to handle things that might go wrong. I would probably feel the same way if I could step through the inner workings of a moderately complex Python program through the lens of a C decompiler.

To put it another way: almost everything that makes a program in language XYZ slower than one in C probably uses a feature which you could have synthesized manually in C (as a design pattern instead of a language feature), but which you chose not to, simply because it would have taken you painfully long to do so.

I bet you could write C-like code in almost any language, and it would probably run at least as fast (given a reasonable optimizer). You could also drive a Ferrari as slowly as a shopping cart and it would be at least as safe. But why would you?
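
As a rough illustration, in C# since that's the language used elsewhere in these posts (I haven't benchmarked this exact pair, so take the performance claim as the hedge it is): the first version is essentially C with different keywords, and a decent JIT should compile it to much the same machine code a C compiler would produce; the second leans on higher-level machinery and trusts the library and JIT to keep up.

// "C-like" C#: a flat array, a plain index loop, no allocations, no delegates.
int SumCStyle(int[] numbers)
{
    int total = 0;
    for (int i = 0; i < numbers.Length; i++)
        total += numbers[i];
    return total;
}

// Higher-level C#: shorter to write, and leaves the performance question to the library and the JIT.
int SumLinq(int[] numbers)
{
    return numbers.Sum(); // extension method from System.Linq
}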


[1] 299 792 458 m/s
[2] the_unreasonable_effectiveness_of_c.html