Month: March 2016

Why Types Introduce Coupling

Last time I wrote a long post about why C is complicated and how JavaScript can make your code simpler and less buggy. Today I want to touch on another idea: that statically typed languages introduce a level of coupling into your program that’s unnecessary and makes maintenance harder.

Imagine that I ask you to write a function to simply iterate through a list of integers and output them to the console. In C, you might write the following code:

#include <stdio.h>

/* Print each element of a contiguous int array, one per line. */
void printListToConsole(const int* list, int count) {
    for (int i = 0; i < count; i++) {
        printf("%i\n", list[i]);
    }
}

But wait a second. I didn’t tell you that the list was going to be presented as a contiguous array! What if it was a linked list? Then the code might look like the following, assuming that we had previously declared the types LinkedList and LinkedListNode appropriately:

/* Walk the list from its first node, printing each value, until next is NULL. */
void printListToConsole(LinkedList* list) {
    LinkedListNode* node = list->first;
    while (node) {
        printf("%i\n", node->value);
        node = node->next;
    }
}

The problem in both cases is that the function we’ve written is intimately coupled to the type of the parameters it accepts. C has no direct way to express the idea of “I want my function to accept any list-like type”. This means that we cannot reuse either function where the type is even slightly different: the array version cannot print a linked list, and the linked-list version cannot print an array.

C++

In C++, we can address this problem to some extent by using templates. We could write the above function something like this in C++:

#include <cstdio>

// Accepts any container of ints that provides begin() and end().
template <typename TList>
void printListToConsole(const TList& list) {
    for (auto i = list.begin(); i != list.end(); ++i) {
        std::printf("%i\n", *i);
    }
}

There are still some of the same issues in this code. For example, do we pass an iterator pair to the function (as many STL functions do), or do we pass the container itself?

To me this is a non-solution. For one thing, it’s saying that if you want to write reusable code that’s decoupled from the types of its parameters, you need to write everything in your program as templates. This is clearly not an option for many reasons, including compilation time, error messages, complexity, etc.

All that C++ templates give us is a taste of a world without explicit types. The above C++ function is decoupled from the caller simply because it does not state what type it expects.

It’s true that there are languages that perform type inference and allow you to write pretty generic code without losing static type safety, but as with C++, the error messages and language complexity associated with this style of coding make it difficult for beginner-to-intermediate level programmers.

C#

At first glance, you may think that higher level statically-typed languages solve this with better abstraction mechanisms. Consider the C# code for the above function:

public void printListToConsole(IEnumerable<int> list) {
    foreach (int i in list) {
        Console.WriteLine(i);
    } 
}

The above function says that it accepts any argument that is enumerable, meaning anything that can be iterated. This is fantastic from a reusability and code maintenance perspective, so it does address one side of the issue. But there is another issue lurking behind the scenes, one that may be more noticeable to people working in very resource-constrained environments such as embedded firmware.

Although the above C# code does not couple itself to the specific type of integer list, and so can be used throughout the program, it does still have a level of coupling at the binary level: the function accepts something that has a binary signature of IEnumerable<int>. It is not the same as a C++ template, which is decoupled because the compiler instantiates the template separately for each binary signature it is used with.

This machine-level coupling is invisible to the programmer, which is convenient and important when it comes to managing large and complex codebases. But remember that interfaces as a language feature are normally implemented at the binary level by passing control through several levels of indirection. I won’t go into detail – you can look it up yourself if you’re interested – but here is a brief summary to illustrate my point.

The details vary between compilers and runtimes, but in a typical C++-style implementation the machine code must first look up a virtual dispatch table from the interface pointer, then look up the entry in that table corresponding to the method you want to call, and then perform an indirect call through that entry. The indirect call doesn’t go straight to the function implementation, but first through a thunk that adjusts the this pointer from the interface’s this pointer to the underlying object’s this pointer (which are different because objects can implement multiple interfaces), and then the thunk redirects to the actual function. All of this is needed to perform a single function call on an abstract interface.
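To make the number of hops concrete, here is a toy model of that sequence written in JavaScript. It is only a sketch: plain objects stand in for pointers and tables, it says nothing about how any real compiler or runtime lays out memory, and every name in it (counterAsInterface, owner, callThroughInterface and so on) is invented for illustration.

```javascript
// Toy model only: plain objects stand in for pointers and tables.

// The concrete object and the real implementation of its method.
const counter = { count: 0 };
function incrementImpl(self) {
    self.count += 1;
    return self.count;
}

// The thunk receives the interface view, recovers the underlying object
// (the "this adjustment"), and forwards to the real implementation.
function incrementThunk(interfaceView) {
    return incrementImpl(interfaceView.owner);
}

// Dispatch table for a hypothetical one-method interface: slot 0 is increment.
const INCREMENT_SLOT = 0;
const dispatchTable = [incrementThunk];

// What the caller actually holds: a view that knows its table and its owner.
const counterAsInterface = { table: dispatchTable, owner: counter };

function callThroughInterface(interfaceView, slot) {
    const table = interfaceView.table; // hop 1: find the dispatch table
    const entry = table[slot];         // hop 2: find the entry for the method
    return entry(interfaceView);       // hop 3: indirect call, landing in the thunk
}

callThroughInterface(counterAsInterface, INCREMENT_SLOT);
console.log(counter.count); // 1
```

The caller never touches incrementImpl directly; it gets there only after the table lookup, the slot lookup and the thunk.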

Although the C# language doesn’t dictate these details, you have to be aware of them anyway. The logical types are still coupled to the binary types, and so the choices you make about how generic you want your code to be really do still have an effect on the performance and overheads involved in your program.

JavaScript

So what’s different about JavaScript? Simply put, JavaScript doesn’t require you to specify these type details at all. This not only brings a level of decoupling to the logical program, where you gain the benefits of reusability and maintainability, but also frees the compiler, at least in theory, to apply a whole new range of optimizations. As compilers get better and better, I think this will be more and more the case, and soon the performance of a program will depend more on how few details you dictate than on how many, and the performance of JavaScript programs will exceed that of C and C++ programs.
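For comparison with the earlier versions, here is roughly how the same function might look in JavaScript. This is a minimal sketch that assumes a runtime with ES2015 iteration (for...of and Symbol.iterator); the LinkedList class and its prepend method are hypothetical, written only so there is something other than an array to pass in.

```javascript
// One function, no declared parameter type.
// It relies only on the argument being iterable (usable in for...of).
function printListToConsole(list) {
    for (const item of list) {
        console.log(item);
    }
}

// Hypothetical linked list, made iterable by implementing Symbol.iterator.
class LinkedList {
    constructor() {
        this.first = null;
    }

    // Adds a value to the front of the list and returns the list for chaining.
    prepend(value) {
        this.first = { value, next: this.first };
        return this;
    }

    // Generator that yields each node's value from first to last.
    *[Symbol.iterator]() {
        for (let node = this.first; node; node = node.next) {
            yield node.value;
        }
    }
}

// The same function accepts an array, a linked list, and a Set.
printListToConsole([1, 2, 3]);
printListToConsole(new LinkedList().prepend(3).prepend(2).prepend(1));
printListToConsole(new Set([1, 2, 3]));
```

Each of the three calls prints 1, 2 and 3 on separate lines. Nothing in printListToConsole names the container type, so the array and linked-list cases from the C example collapse into a single function.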