# Async in C

C# has an amazing feature called async, which we’ve talked about many times before on this blog. It allows a programmer to write non-blocking functions without needing to use threads. What would it look like to have async functionality in C?

I’ve been working on an experimental “mini-programming-language” which does just that. It doesn’t work the same way as in C#, because the needs in C are completely different. In C you don’t want all the hidden overhead costs that exist in C# related to setting up tasks and delegates and execution context. In C, things should be more or less as they seem.

## What does it look like?

In this experimental language, you can declare functions as async, to say that they don’t complete immediately in a blocking fashion, but instead may complete asynchronously. When functions are declared async, some interesting things happen. One thing is that, in the context of the async function, the word return actually refers to a function, which can be called and saved like any other function. For example, here is an async function foo which returns the value 1.

Of course this function actually returns synchronously, even though it’s declared async. The only difference is that it’s returning using continuation passing style instead of a direct return. But using this feature we could actually delay the return to another point in time:

Now we’ve saved the return continuation and only triggered it when some event returns. We discussed last time what a continuation might actually look like in C, so this week we’ll just elide the type details and say that a continuation is of type Continuation<T>, where T is the return type of the function calling the continuation. Values of this type are each physically the size of a single pointer, and can be executed using the same syntax as a function.
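As a rough sketch of the idea in plain C: the names below, and the two-field continuation struct, are assumptions made purely for illustration. In particular, this is not the single-pointer Continuation<T> representation from the previous post. It shows one function that "returns" 1 immediately through its continuation, and one that saves the continuation and only fires it when a hypothetical event arrives.

```c
/* Hypothetical continuation type: a function pointer plus a context pointer.
   Illustrative stand-in only; not the single-pointer Continuation<T>. */
typedef struct {
    void (*fn)(void *ctx, int result);
    void *ctx;
} IntContinuation;

/* CPS version of "return 1": completes synchronously by calling the
   continuation before foo_immediate itself returns. */
void foo_immediate(IntContinuation ret) {
    ret.fn(ret.ctx, 1);
}

/* Storage for the continuation saved by the delayed variant. */
IntContinuation saved_return;
int has_saved_return = 0;

/* Delayed variant: stash the continuation instead of invoking it. */
void foo_delayed(IntContinuation ret) {
    saved_return = ret;
    has_saved_return = 1;
}

/* Hypothetical event handler: fires the saved continuation, if any. */
void on_event(void) {
    if (has_saved_return) {
        has_saved_return = 0;
        saved_return.fn(saved_return.ctx, 1);
    }
}

/* Example continuation body: stores the result through ctx. */
void store_result(void *ctx, int result) {
    *(int *)ctx = result;
}
```

With foo_immediate the result is stored before the call returns; with foo_delayed nothing is stored until on_event runs.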

Now comes the interesting bit. Say we have a function, bar, which calls foo. In this experimental language, you can simply define bar like this:

Now clearly bar must be asynchronous as well, since it calls foo, and depends on the result of foo before it can continue. But the magic is that we don’t need to declare the asynchrony explicitly. The experimental language compiler not only infers the asynchrony, but does the corresponding conversion to CPS automatically.

This is more than just a minor convenience. Imagine code like this:

This is a relatively simple function, but if we had to write the asynchrony out explicitly in C we would have code like the following¹:

This is beginning to look like unstructured code. The for-loop construct is now completely hidden, and looks more like the old days of conditional-branch-and-jump. We’re also worrying about things that the compiler really should be sorting out for you. Like passing in the return address, passing in a space for storage of local variables, etc. These are all things you generally don’t have to worry about, so why now?
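To make that concrete, here is a minimal sketch of a loop written out by hand in continuation-passing C. Everything in it is hypothetical: the function being transformed is assumed to be "sum foo(i) for i from 0 to n-1", foo_async is a made-up callee that happens to complete immediately, and the continuation struct is an illustrative stand-in rather than the type from the previous post.

```c
/* Hypothetical continuation type (illustrative stand-in). */
typedef struct {
    void (*fn)(void *ctx, int result);
    void *ctx;
} IntContinuation;

/* Hypothetical async callee in CPS; here it completes immediately with x * 2. */
void foo_async(int x, IntContinuation ret) {
    ret.fn(ret.ctx, x * 2);
}

/* The loop's local variables, hoisted into storage supplied by the caller. */
typedef struct {
    int i, n, total;
    IntContinuation ret;   /* the "return address" */
} SumFrame;

void sum_step(SumFrame *f);

/* Runs when foo_async delivers a value: the tail of the loop body. */
void sum_on_foo(void *ctx, int result) {
    SumFrame *f = ctx;
    f->total += result;
    f->i++;
    sum_step(f);           /* jump back to the loop condition */
}

/* The loop condition, expressed as a branch instead of a for statement. */
void sum_step(SumFrame *f) {
    if (f->i < f->n) {
        IntContinuation k = { sum_on_foo, f };
        foo_async(f->i, k);
    } else {
        f->ret.fn(f->ret.ctx, f->total);   /* "return total" */
    }
}

/* Entry point: the caller passes in the frame storage and the continuation. */
void sum_async(SumFrame *f, int n, IntContinuation ret) {
    f->i = 0;
    f->n = n;
    f->total = 0;
    f->ret = ret;
    sum_step(f);
}

/* Example continuation body: stores the result through ctx. */
void store_result(void *ctx, int result) {
    *(int *)ctx = result;
}
```

Because the made-up foo_async calls its continuation straight away, this sketch recurses once per iteration; a callee that really completed later would return to its event loop between steps.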

The experimental language I’m working on handles all of this asynchrony automatically. If the above examples are anything to go by, then certain types of code will be reduced to a quarter of their size and be orders of magnitude easier to read, if they were written in this experimental language instead of C. I would perhaps go as far as saying that a fair amount of multithreaded code could instead be written in this async style to run on a single thread, and would as a result be much easier to reason about.

1. Again, using the continuation style defined in my last post.

# Sequences: Part 5

Last time, I talked about push and pull regarding sequences. We saw that it’s more convenient to write code that pulls from its inputs and pushes to its outputs. We took a look at C#’s generators, and how they enabled us to write sequence-processing functions in this way, without the need for intermediate buffers.

Let’s quickly recap generators. A generator in C# looks like a traditional function (with a signature that returns a sequence), but it can push values to the caller using the special syntax yield return, which essentially puts the generator function “on hold” until the consumer/caller asks for the next value¹:

The two parties involved here are the generator and the caller (which I’ll call the consumer since the generator is a producer).

When the consumer asks for the next value in the sequence, the generator function is temporarily “resumed”, long enough to produce the next value of the sequence. Last time we drew an analogy with freezing time to explain why it’s easier to write the generator code now that it thinks it’s pushing values to the consumer.

But it’s important here to note who is being paused and who is being resumed. When the compiler is producing IL for the consumer function and the generator function, it is the generator that gets reorganized into a form where it can be paused and resumed (it gets converted into a class which implements the IEnumerable<T> "pull" interface).

But what would happen if the next item in the sequence just wasn’t available yet? If we go back to last week’s C example of reading input from the user by pulling values from getchar (or Console.Read in C#), you can see that generators wouldn’t fix the conflict between push and pull in that case.
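To illustrate the kind of reorganization involved, here is a hand-written C sketch of a generator that yields 10, 20 and 30, turned into a resumable state machine. The struct stands in for the compiler-generated class, and the names are made up for this example; the real C# output is considerably more involved.

```c
#include <stdbool.h>

/* Stand-in for the compiler-generated class: remembers where the
   generator was paused and the last value it yielded. */
typedef struct {
    int state;     /* which "yield return" to resume after */
    int current;   /* most recently yielded value */
} ThreeValues;

void three_values_init(ThreeValues *g) {
    g->state = 0;
    g->current = 0;
}

/* Pull interface: resumes the generator just long enough to reach its
   next "yield return". Returns false once the sequence is finished. */
bool three_values_move_next(ThreeValues *g) {
    switch (g->state) {
    case 0: g->current = 10; g->state = 1; return true;
    case 1: g->current = 20; g->state = 2; return true;
    case 2: g->current = 30; g->state = 3; return true;
    default: return false;
    }
}
```

The consumer stays an ordinary function that calls three_values_move_next in a loop; only the producer has been restructured.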

Let’s simplify things a bit to investigate further. Instead of considering a whole sequence of items, let’s say that there’s just one item. We can pull the item from somewhere by calling a function that returns that item:

When the consumer calls PullFromProducer to fetch the item (an integer), the caller is blocked until the PullFromProducer function returns (synchronously).

The generator syntax in C# still uses this pattern under the covers – the generator function still returns IEnumerable<T>, which as we know from our previous exploration is a pull-based iterator interface.

But what if PullFromProducer simply doesn’t yet have the value that it needs to return? For example, how do we implement the Pull function if it’s to pull from a network connection, which may not have received the value yet?

Just as the C# generator makes it possible to pause the producer, wouldn’t it be nice if there were a way to pause the consumer? Obviously we can do this with threads, but wouldn’t it be nice if there were a way to do this without the overhead of threads?

It turns out that in C# there is. C# 5 introduced the concept of async functions. You’ve seen async functions before on this blog, so I won’t go into too much detail. If you aren’t too familiar, I highly recommend reading up about them (here is the MSDN introduction, and I also highly recommend Jon Skeet’s Eduasync series for really getting to know what’s going on behind the scenes²).

Using async we can make code that looks like this:

The magic happens in the consumer this time. The consumer function is suspended at the “await” point until the producer pushes the value to the consumer.

To emphasize what’s happening here, let’s look at a slightly different example:

If you run this³ you’ll see the output is something like this:

The interesting thing is the order of the messages. The message line “Consumed: 42” occurs directly after “Pushing value to consumer” rather than after “Consumer is about to await value”, which clearly shows that the consumer is suspended during the intermediate time. But just like with generators, it’s important to realize that the above example does not create any additional threads. Just like with generators, the async functionality is implemented by the compiler by creating a new class behind the scenes.
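The same ordering can be reproduced by hand on a single thread. The following C sketch is only an analogy, with made-up function names: the consumer's remaining work is saved as a function pointer when it "awaits", and the producer invokes it when it pushes the value. The trace buffer exists just so the order of messages can be checked.

```c
#include <stdio.h>
#include <string.h>

/* Record of every message, in order. */
char trace[256];

void say(const char *msg) {
    puts(msg);
    strncat(trace, msg, sizeof trace - strlen(trace) - 1);
    strncat(trace, "\n", sizeof trace - strlen(trace) - 1);
}

/* The consumer's remaining work, saved while it is suspended. */
void (*pending)(int value);

/* The part of the consumer that runs after the "await". */
void consumer_resume(int value) {
    char line[32];
    snprintf(line, sizeof line, "Consumed: %d", value);
    say(line);
}

/* The part of the consumer that runs before the "await". */
void consumer_start(void) {
    say("Consumer is about to await value");
    pending = consumer_resume;   /* suspend by returning to the caller */
}

/* The producer pushes a value, which resumes the suspended consumer. */
void producer_push(int value) {
    say("Pushing value to consumer");
    if (pending) {
        void (*resume)(int) = pending;
        pending = NULL;
        resume(value);
    }
}
```

Calling consumer_start() and then producer_push(42) produces the three messages in the order described above, with no second thread involved.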

This solves our problem, right?

Nope.

The problem is that async only works with a single value. We can use it to push a once-off item, but not whole sequences of items.

C# is stuck with two different ways of doing things with sequences. There’s the pull-based approach with IEnumerable<T>. And there’s the push-based approach with IObservable<T> (I won’t go into IObservable, but if you’re interested take a look at reactive extensions – they echo many of the great features of IEnumerable, such as all their extension methods, but do it for a push-based interface instead of a pull-based one).

What we need is something more like an IAsyncEnumerable<T> interface, which combines task-based asynchrony with a sequential pull-based interface. We also need language support for IAsyncEnumerable<T>, including generators and foreach statements. The combination of generators and IAsyncEnumerable would allow us to have everything we’ve been looking for so far:

• No containers required (sequences don’t have to be in memory before you can work on them)
• Zero buffering overhead (when we can process sequences as fast as they’re produced)
• Completely abstract sequence types (a sequence of user key press events can be as much a sequence as an array of integers)
• Push/pull agnostic (IAsyncEnumerable covers both push and pull cases equally)
• All functions can be written in a form where they both pull input and push output

I apologize to those who aren’t comfortable in C#, since I did originally say that this was going to be a language-agnostic investigation but we landed up in C# anyway. Unfortunately this is because it seems that C# is the only popular language that’s made it this far in providing a solution that fits all these criteria. It just needs to take the last step (although I’ve mentioned in the past that I think that async is flawed in a way that only a new language can cure). C and C++ are simply not well suited to this kind of coding at all.

This brings us to the end of our series on sequences. We started with the simplest C example, which required a buffer on both the input and output side, could not be suspended at all, and provided no abstraction over the form the input and output could take. We considered ways to improve it, and in doing so investigated how sequence-processing functions can be composed/layered, and the differences between push and pull. At each stage of improvement we ruled out newer and newer old-languages, until we landed up with only a theoretical research-based extension to the latest C#, which seems not to have made it into the mainstream despite it being investigated more than 4 years ago.

1. This is a very limited description. For more detail take a look at my previous post and read up about it online.

2. Although his series is a bit old now, much of what he said still applies, and it is the most insightful writing I’ve seen on the topic

3. I admit, I don’t have a C# compiler installed right now, so I can’t actually confirm this. If you see a mistake please let me know.

# C#’s async is the wrong way around

C# recently (in the last few years) introduced the async/await language feature. If you aren’t familiar with it, I’ll wait here while you go check it out.

Welcome back. Async is awesome right?

So let’s say we have some asynchronous method, Foo, and we want to call it from the asynchronous method Bar. The syntax looks like this:

So what does await Foo() mean? To quote from MSDN:

The await operator is applied to a task in an asynchronous method to suspend the execution of the method until the awaited task completes

Seems simple enough. The statement await Foo() calls Foo, suspends Bar, and resumes Bar when Foo comes back with a result. Now let’s consider what the synchronous version of the code would have looked like (assuming Foo2 is the synchronous version of whatever Foo would have done):

So, what does Foo2() mean in this synchronous example? Well, to me it means, “suspend (block) execution of Bar2 while you execute Foo2, and then resume it when Foo2 comes back with the result”. This is obvious: when you call a function you don’t expect the caller to continue before the called function has returned. The caller “awaits” the called function. Right?

The natural expectation is for the calling function to wait for the called function to complete. The asynchronous code should actually look like this¹:

If we actually intended to use the result as a task instead of awaiting it, we could do something like this:

In the above example, the async call can be thought of as something like a “fork”, because the called function can head off asynchronously to the calling function. Not in the thread or process sense of the word “fork”, since async has very little to do with threads directly, but in the sense that control is in some way branched.

I’m not actually saying that the C# async feature was designed incorrectly. The choice to do it this way in C# is very reasonable considering backwards compatibility with previous versions, compatibility with .NET, and especially backwards compatibility with what people think a “call” actually is. Rather, what I’m demonstrating is that “async” is actually a feature that’s always existed in some way since there were first function calls – the caller always has “awaited” the callee. When people tell you what C#’s await does, they’re in a way really telling you about things that the compiler and runtime do to implement the waiting differently to normal. The way in which the caller is suspended is different between the asynchronous and synchronous code, but that shouldn’t have to change the model that we use to reason about our code.

It strikes me then, that using async and await in C# is much like using inline code in C. They’re both special syntaxes that prompt the compiler to perform specific optimizations. If you write “synchronous” code then the entire stack (and thread context) is preserved and “suspended”, while if you put in a few special words “async”, “await”, and “Task<>” then you cue the compiler to only preserve the current stack frame.

Perhaps for a while this is the only way it will be, but I imagine that slowly it will change so that async vs sync merely becomes an optimization detail. All functions, and property accessors, will work equally with sync or async, so you won’t have to choose between them 99% of the time. You won’t need two separate functions to GET from a URL asynchronously vs synchronously. You simply call the function, and the compiler figures out the best calling convention (synchronous vs asynchronous).

1. Actually, if I was designing the language from scratch, the asynchronous code would look identical to the synchronous code above, since they both do the same thing.

# Compile Time Languages

Let’s say I want to call a stored procedure in a database. The stored procedure is called GetBooksByAuthor, and takes one parameter, “author”, which is a string.

It would be nice to call it like this:

But how do we do that? Currently, there are only a few ways I can think of:

• A code generator could connect to the database, obtain all the stored procedures, and populate a class called “database” with the appropriate functions
• The same thing, but we could interact with the generator to only get a few stored procedures. Of course this means that we have to manually intervene every time we need a new procedure.
• We use a dynamic language, which is able to take the function name as the string “GetBooksByAuthor” at run-time, and dynamically populate the call to the database

The code generators are generally a bad idea. For one thing, they generate code (and normally pretty bad code), which just clutters up the real code. They also require a separate step. Before we code against a new stored procedure (or a changed one) we need to (re)generate the code that proxies for the stored procedure. This can’t be done automatically, since it happens before we code (otherwise the auto-completion and wonderful things like that wouldn’t work).

Also, in the case where we don’t do a brute-force fetch-all approach, we also need to specify which procedures we want to bind to. This list of procedures is probably stored with the code generator and not with the code that actually calls the function. This is terrible for code locality. Often the generated code will be part of the data-access-layer of the program, while the code that actually uses it will be somewhere else.

On the other hand, if we use a dynamic language, we lose quite a few things:

• Static type checking
• Performance
• Auto-completion and other IDE assistance

But what if we could get the best of both worlds? What if we could use “dynamic” functions, but execute them at compile time? Consider this example, based on a hypothetical language which is related to C#:

In the above example, it would pretty much just “work” in C# if you took out the words “meta” and “static var” – although the database.GetBooksByAuthor would instead look like database.Call(...). The word “meta” would simply tell the compiler to translate calls from the form database.GetBooksByAuthor(...) into calls of the form database.Call("GetBooksByAuthor", ...). This is nothing interesting, and has nothing to do with compile time vs runtime – it should work the same either way (in this hypothetical language).
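For a loose analogy in a language that exists today, the C preprocessor can already turn an identifier into a string at compile time. The sketch below is only that analogy, not the hypothetical “meta” feature: Database, db_call and DB_CALL are made-up names, and the "database" merely records what it was asked to call.

```c
#include <stdio.h>

/* Hypothetical database handle; it only records the last request. */
typedef struct {
    char last_procedure[64];
    char last_argument[64];
} Database;

/* Hypothetical generic entry point that takes the procedure name as a string. */
void db_call(Database *db, const char *procedure, const char *argument) {
    snprintf(db->last_procedure, sizeof db->last_procedure, "%s", procedure);
    snprintf(db->last_argument, sizeof db->last_argument, "%s", argument);
}

/* The # operator turns the identifier into a string literal at compile
   time, so the call site names the procedure without quoting it. */
#define DB_CALL(db, procedure, argument) \
    db_call((db), #procedure, (argument))
```

With this in place, DB_CALL(&db, GetBooksByAuthor, "Some Author") expands to db_call((&db), "GetBooksByAuthor", ("Some Author")). Unlike the hypothetical language, nothing here checks the name against the real database.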

Why are there two connection strings? Well, you might not notice it, but normally there are two connection strings. You use one connection string in your development/build for your code generator or IDE, and often a different connection string (provided dynamically as a configuration) for your production environment. The above program just makes it more obvious – and it also puts them in the same place.

This example would work perfectly fine at run-time. But the interesting thing is what happens if you use partial-evaluation to specialize the Database class at compile time. In particular, I think the runtime program could be simplified by the compiler down to the following:

Moreover, since a lot of it can be evaluated at compile time, the IDE could also provide auto-completion and other information. For example, when you type “database.”, the IDE can provide a list of stored procedures (by executing user code).

## Taking it further

This could be used for more than database access. It could be used to provide linking to files of completely different languages, even if they exist in the same program. You wouldn’t need compiler extensions to be able to link to C++, all you would need is a library which knows how to call a C++ ABI.