Practical Dependent Types

This morning I found another practical need for dependent types. The details are confidential, but the situation is common enough that I’m not giving anything away by describing it in general terms.

It’s quite simple: we have an embedded device with a flat, fixed “file system”. There are no file names, just 10 fixed-size “slots” where a file can go. Each slot is reserved for a particular module of the program to store its persistent data, and has a binary format that is only useful to that module. For example, slot 3 might hold a configuration for calibrating the power module. [1]

So how does the power module read its configuration from the file system? A simple implementation is something like this:

enum FileId
{
    ...
    POWER_SYSTEM_FILE = 3,
    ...
};

void* get_file_data(FileId id); // implemented somewhere else

// The structure of the file the power system uses
struct PowerSystemFile { ... }; 

void power_system_init()
{
    PowerSystemFile* file = 
        (PowerSystemFile*)get_file_data(POWER_SYSTEM_FILE);
    // Use the file to apply the power configuration
}

The suspect piece of code is the cast from void* to PowerSystemFile*. I think it’s generally a code smell to have to cast like this. A clue that the smell is justified: if the language were completely type-safe, this cast would either be disallowed outright or would require a runtime check (and possibly fail at runtime). In a language like C or C++, a static cast like this is a potential entry point for catastrophic failure with some very strange symptoms. So the question is: how could this be implemented without a cast, so that the compiler can verify that we’re doing the right thing?

This to me is an ideal situation for dependent typing. I’m going to make up a fictitious language that might express the things needed to make this example work the way I want it to.

struct FileSlot<FileId id> // FileId is the same as previously declared
{
    switch (id)
    {
        ...
        case POWER_SYSTEM_FILE: PowerSystemFile file;
        ...
    }
};

FileSlot<id>* get_file_data(FileId id); // Implemented elsewhere

void power_system_init()
{
    FileSlot<POWER_SYSTEM_FILE>* slot = get_file_data(POWER_SYSTEM_FILE);
    PowerSystemFile* file = &slot->file;
    // Use the file to apply the power configuration
}

Let me briefly explain the syntax of this imaginary language in case it’s not completely self-evident. The struct FileSlot<> takes an integer argument. For those who know C# or Java generics the syntax should look familiar, except that here FileSlot takes an “enum” value as its parameter rather than a type. The switch statement effectively creates a tagged union: depending on the value of id (the tag), a different member of the union is accessible. In this case, though, the tag is not stored in the union structure; it is passed in as a type parameter.

The especially interesting piece of code is the declaration of get_file_data, which returns a pointer to FileSlot<id>, a type that depends on the value of the parameter id. This is the real dependent type, since it’s the first place where the parameter of FileSlot is a variable rather than a constant.

For the sake of comparison, I’m going to implement this in C++ as well:

template <FileId id>
struct FileSlot;

template <>
struct FileSlot<POWER_SYSTEM_FILE> // Template specialization
{
    PowerSystemFile file;
};

template <FileId id>
FileSlot<id>* get_file_data(); // Implemented elsewhere

void power_system_init()
{
    FileSlot<POWER_SYSTEM_FILE>* slot = get_file_data<POWER_SYSTEM_FILE>();
    PowerSystemFile* file = &slot->file;
    // Use the file to apply the power configuration
}

The first thing I’d like to point out is that you can actually do this in C++. But there are some important differences. The most practical one in my mind is that the C++ version is purely a compile-time concept. The id is not a parameter of get_file_data; it’s a template parameter, and the compiler generates a separate instance of get_file_data for each id it’s used with. If you only knew the id at runtime then this function wouldn’t work at all (for example, if you needed a function that loops through all the files to do file synchronization).
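To make that limitation concrete, here is a minimal sketch (C++14, for the generic lambda) of the usual workaround: a hand-written switch that turns a runtime id back into a compile-time one. Only POWER_SYSTEM_FILE = 3 comes from the example above. RADIO_FILE, RadioFile, the struct fields, visit_slot, total_slot_bytes and the in-memory get_file_data are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// RADIO_FILE is a hypothetical second slot, added only so the loop has two ids.
enum FileId { RADIO_FILE = 2, POWER_SYSTEM_FILE = 3 };

// Hypothetical file layouts; the real ones are elided in the post.
struct RadioFile { int channel; };
struct PowerSystemFile { int calibration_mv; };

template <FileId id> struct FileSlot;
template <> struct FileSlot<RADIO_FILE> { RadioFile file; };
template <> struct FileSlot<POWER_SYSTEM_FILE> { PowerSystemFile file; };

// Stand-in for the real storage: one zero-initialized static slot per id.
template <FileId id>
FileSlot<id>* get_file_data()
{
    static FileSlot<id> slot{};
    return &slot;
}

// Bridge from a runtime id to the compile-time one. Each case instantiates
// get_file_data for one constant id, so `visit` sees a fully typed slot.
template <typename Visitor>
void visit_slot(FileId id, Visitor&& visit)
{
    switch (id)
    {
        case RADIO_FILE:        visit(*get_file_data<RADIO_FILE>()); break;
        case POWER_SYSTEM_FILE: visit(*get_file_data<POWER_SYSTEM_FILE>()); break;
        default:                assert(false && "unknown FileId"); break;
    }
}

// A "loop through all the files" function, e.g. the first step of a sync.
std::size_t total_slot_bytes()
{
    std::size_t total = 0;
    const FileId ids[] = { RADIO_FILE, POWER_SYSTEM_FILE };
    for (FileId id : ids)
        visit_slot(id, [&total](auto& slot) { total += sizeof(slot); });
    return total;
}
```

The switch has to be kept in step with the enum by hand, which is the kind of bookkeeping I would want the compiler to do for me.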

In my hypothetical language the FileId id parameter (of both the type and the function) is a real runtime variable. In the case where the id is constant, the compiler could apply constant propagation and emit exactly the same binary as the C++ version, so I don’t see this as a compromise on performance. On the other hand, the compiler could keep the runtime version if it was actually more efficient, which in this case it probably is, since our model has a flat file structure where get_file_data probably just fetches data from the id location in the file medium.
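For contrast, the closest thing I know of in standard C++ to a get_file_data that takes a runtime id is one that returns a std::variant (C++17). This is only a rough sketch under assumptions: RADIO_FILE, RadioFile, the struct fields, AnySlot, holds_power_file, power_calibration_mv and the in-memory storage are invented for illustration. Note what is lost compared with the hypothetical language: even when the id is a constant, the type of the result is only checked at runtime.

```cpp
#include <cassert>
#include <variant>

// RADIO_FILE is a hypothetical second slot; only POWER_SYSTEM_FILE is from the post.
enum FileId { RADIO_FILE = 2, POWER_SYSTEM_FILE = 3 };

// Hypothetical file layouts; the real ones are elided in the post.
struct RadioFile { int channel; };
struct PowerSystemFile { int calibration_mv; };

// One alternative per slot layout. The variant's internal tag plays the role
// of id, but the compiler does not relate the tag to the argument.
using AnySlot = std::variant<std::monostate, RadioFile*, PowerSystemFile*>;

// Stand-in for the real storage, with made-up contents.
AnySlot get_file_data(FileId id)
{
    static RadioFile radio = { 11 };
    static PowerSystemFile power = { 3300 };
    switch (id)
    {
        case RADIO_FILE:        return &radio;
        case POWER_SYSTEM_FILE: return &power;
    }
    return std::monostate{}; // unknown id: no file
}

// True if the slot for `id` holds a PowerSystemFile.
bool holds_power_file(FileId id)
{
    return std::holds_alternative<PowerSystemFile*>(get_file_data(id));
}

int power_calibration_mv()
{
    AnySlot slot = get_file_data(POWER_SYSTEM_FILE);
    // The id is a constant here, yet the type check still happens at runtime:
    // std::get_if returns nullptr if the slot holds some other alternative.
    PowerSystemFile** file = std::get_if<PowerSystemFile*>(&slot);
    assert(file != nullptr && "slot 3 should hold a PowerSystemFile");
    return (*file)->calibration_mv;
}
```

So the cast is gone, but it has been traded for a runtime check rather than a compile-time guarantee.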

As a final note, I don’t actually like any of these solutions completely. The reason is code cohesion. This example requires that we define all possible values of FileId in one place (the enum declaration), and correspondingly all possible types (or template specializations) of FileSlot need to be defined in one place (they may be physically separated by using #include, but the coupling is still there). This has some benefits (if you need to know all the file types then you know where to look), but it smells to me because of its lack of modularity. This isn’t a shortcoming of dependent types, just of this particular design.

Anyway, I hope that this practical example of where a dependent type may be useful has set off a spark in your mind. And I hope I’ve made you feel a little bit more uncomfortable in your most used language, like I am every day. Enjoy the bitter taste now that you can taste it, because without it you wouldn’t know to move to fresher pastures. [2]


[1] This whole example is completely inaccurate relative to the real situation, but it works well as a simplified model that explains the essence of what the requirement is.

[2] If you can’t taste it, don’t worry. Just wait 10 or 20 years and you’ll be jostled along with the rest of the herd.
