Mutating Types

Mutating Types

This is an idea I have for a feature for a statically typed language which would make the type system more expressive. I call it "mutating types", and in a nutshell the idea is that the act of mutating a variable can also change its type.

I’d like to preface this article by saying that I’ve come up with this idea completely on my own, meaning that if it appears that I’m copying some other academic work it is completely unintentional, and please let me know if I’m doing it. Likewise, please don’t use this idea without acknowledging that it’s mine, and please contact me if you do intend to use it, even if you modify it somewhat.

I apologize in advance for how long this "post" is. I felt it was important to start with the problem description to put the idea into context, and to end with some explanations of how this might be used in practical situations.

The Problem

Let me illustrate the scenario with an example. I’m going to add the feature to C++ in this example. Consider the following types:

struct Int{
    int value;
};

struct Null {
};

struct NullableInt {
    bool isNull;

    union
    {
        Int intValue;
        Null nullValue;
    } value;

    void set(Int i) { value.intValue = i; isNull = false; }
    void reset() { value.nullValue = Null(); isNull = true; }
};

So, a NullableInt is a tagged union which can either hold an integer with a value, or null. I’ve been unnecessarily explicit about the types for Int and Null, because conceptually they are subtypes of NullableInt. Now lets say we have the following code which defines a new Int with a null value, and then calls a function to give it a new value, and then prints out the value that it has.

int main() {
    NullableInt myInt;
    myInt.reset(); // Initialize myInt to null value

    // Call a function to set the value to 42
    setValue(myInt, 42); // function defined below

    // Print out what the value is
    if (!myInt.isNull)
        std::cout << "myInt is " << myInt.value.intValue.value << std::endl;
    else
        std::cout << "myInt is Null" << std::endl;
}

void setValue(NullableInt& i, int newValue) {
    Int newInt;
    newInt.value = newValue;
    i.set(newInt);
}

The function setValue mutates a NullableInt. The particular mutation is just to assign it to a new value, but in general imagine that the mutation might be more complicated.

What I want to point out is on line 9: we have to check whether the value is null or not, even though we know the value isn't null because of the behavior of setValue. Well, obviously we don't need to check the value, and probably most people wouldn't, but there is nothing about the type signature of setValue that says it won't set it to null. In other words if you just assume that it's not null, you're relying on the maintainers of the setValue function to keep that behavior consistent, and not the compiler. This is a reasonable assumption in this case, but perhaps a bad thing to rely on in general.

How would we guarantee, through the type system, that setValue always "outputs" an Int and not a Null?

Option 1: Avoid Mutation

Well, the functional-programming bias in me might suggest the following:

Int setValue(NullableInt& i, int newValue) {
    Int newInt;
    newInt.value = newValue;
    return newInt;
}

In other words, we avoid mutation altogether, and simply calculate a new value based on the old and return it (in this case the new value is not dependent on the old, but in general it could be, which is why I added i as a parameter). The returned value is of the more-specific type Int and not NullableInt, echoing the intention of the function not outputting a nullable value.

This is elegant, but it's not the same thing. For one thing, there are life-time issues. The original value is not destroyed until after the function returns (presumably by the caller), and it's impossible to specify otherwise. In a language with deterministic destruction this is just simply not the same thing, and it has consequences in performance. Quite often, mutations are small changes to a big thing: a state machine changing state, a list acquiring a new item, etc. Creating a whole new "big thing" is just a waste of time1.

Option 2: Mutating Types

So far I've explained the problem, and also how an immutable style doesn't fix the problem. Now for a real solution, the crux of this post: lets define a mutating type.

I don't know what syntax is best for this, but for the sake of being explicit I'm going to define a mutating parameter type as (T1->T2), where T1 is the type passed into the function, and T2 being the type passed out. It is implied that a mutating parameter type must be pass-by-reference. For example:

void setValue2;
    foo(x);
    foo(x); // err... problem!
}

Here, the fact that foo takes an auto_ptr as a parameter means that it implicitly takes ownership of the pointer. However, the ownership semantics are only a runtime construct in this case, because the compiler doesn't even warn when bar tries to use the value that it just gave away3. But we could rewrite this using mutating types:

void foo((int* -> void) x) {
    cout << *x << endl;
    delete x; 
}

void bar() {
    int* x = new int(42);
    foo(x);
    foo(x); // compiler error: foo expects type `int*` but instead got type `void`.
}

What this example is showing is that the act of passing ownership actually changes the type, so that the type can indicate ownership. Since x is deleted in this case, the new type is void to indicate that the type has no possible values. This not only removes a whole class of possible bugs, but also could potentially improve performance by avoiding unnecessary value-checks.

2. Mutating conditionals

I won't go into too much detail, but imagine that we could eliminate type-casting by instead using mutating conditional statements. That is, if-statements which actually alter the type of their operands.

struct Animal { };

struct Cat : Animal {
    void meow();
}

struct Dog : Animal {
    void bark();
}

void foo(Animal& animal) {
    if (animal is Cat)
        animal.meow(); // the conditional altered the type to `Cat`,
    else if (animal is Dog)
        animal.bark(); // so no type-casting necessary
}

There are some open questions here. For example, what does the is operator look like in this hypothetical language? But you can see possibilities!

3. Once-off subscription/continuation

Here is another situation I run into quite often. In an environment with limited resources you can't always afford to have multiple subscribers to an event. It creates a nice inversion of control to have an event in the first place, instead of a direct function call, but for the sake of performance you aren't going to maintain a dynamic structure to handle the possibility of multiple subscribers because you know there's only one.

typedef void (*Action)(); // Function basic pointer type

struct Event {
    virtual void invoke() = 0;
};

struct UnsubscribedEvent {
    virtual void invoke() { }
};

struct SubscribedEvent : public Event {
    Action _func;

    // Constructor
    SubscribedEvent(Action func) : _func(func) { }

    // Override invoke
    virtual void invoke() { _func(); }
};

void subscribe((UnsubscribedEvent -> SubscribedEvent) event, Action func) {
    delete event;
    event = new SubscribedEvent(func);
}

struct ModuleXStuff
{
    UnsubscribedEvent someEvent;
}    

void initialize()
{
    ModuleXStuff moduleX = initModuleX();
    initModuleY(moduleX.someEvent);
}

void initModuleY((UnsubscribedEvent -> SubscribedEvent) event) {
    subscribe(event, &moduleYsomeEvent);
}

void moduleYsomeEvent() {
    cout << "Some event fired" << end;
}

I won't explain this example too much, since if you've read this far you probably already understand it. The idea is that module Y, whatever that is, needs to subscribe to an event from module X. This is achieved by module X exposing an UnsubscribedEvent, which module Y can mutate into a SubscribedEvent by subscribing to it. Subscribe should actually be a member function of UnsubscribedEvent, but this-mutating member functions is beyond the scope of this post.

The same logic can be applied to continuation functions:

void someLongProcess((UncalledContinuation -> CalledContinuation) onDone) {
    // ...
    // result = ...;
    call(onDone, result); // call the continuation

    call(onDone, result); // Type error, cannot call "CalledContinuation"
}

This is useful if the continuation has some cleanup work to do. If someLongProcess forgets to call the continuation then there will be a type error. And if someLongProcess double-calls the continuation then there will be a type error.

Conclusion

I haven't fully described it here, but mutating types could have a whole host of other uses, and I can't even begin to go into all of them. Most are related to resource and state management, which are particularly important in low level, imperative languages such as C. They are not restricted to "function parameters", although the description of a more general system gets a little complicated to put in a blog. I've only touched the surface, but perhaps I will go deeper in a future post.

Mutating-types embrace the stateful, mutable world that is low-level embedded- or systems-programing. They open up an avenue not just for creating more readable software, but also for making it safer and more performant. I think it would be a great addition to many languages.


  1. Of course, I'm all for the idea of writing simple code and having the optimizer recognize where it can reuse memory and so on, but an ideology of C is that we can have more predictable guarantees about performance, and mutation is one way of achieving that 

  2. NullableInt->Int) i, int newValue) { Int newInt; newInt.value = newValue; i = newInt; } int main() { NullableInt myInt; myInt.reset(); // Initialize myInt to null value // Call a function to set the value to 42 setValue(myInt, 42); // Now, myInt is actually of type Int, and not NullableInt // Print out what the value is std::cout << "myInt is " << myInt.value << std::endl; }

    If you try to interpret the example as straight C++ it may be a little confusing, since Int is not actually a subtype of NullableInt, but this is a different issue. The point is that the function setValue mutates one of it's arguments, and puts restrictions on this mutation by specifying the mutating type.

    Usefulness

    The previous example is pretty contrived. Is this really a useful feature? I'll give you a few examples from real-life where this would be useful.

    1. Resource ownership

    In C++ you can specify pointer ownership using smart pointers. But there's something I've found missing from smart pointers: the ability to specify at compile-time that something acquires or loses ownership.

    void foo(auto_ptr x) {
        cout << *x << endl;
    }
    
    void bar() {
        auto_ptr x(new int(42 

  3. Note that C++11 move semantics don't really address the problem 

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.