Month: July 2013

Aren’t Subtype and Parametric Polymorphism the Same?

Aren’t Subtype and Parametric Polymorphism the Same?

There are apparently two common types of polymorphism in software: subtype polymorphism and parametric polymorphism. I think that these are really the same thing, and I can’t any good reason why modern languages separate the two. I’m going to explain my reasoning in this post.

Firstly, what is are subtype polymorphism and parametric polymorphism? Briefly, subtype polymorphism is when you use class or interface inheritance write more generic code1. The generic code works with the interface or base class, and therefore implicitly works for all subtypes or interface implementations. Parametric polymorphism is when you code using generics or templates – the code has a type parameter which can be substituted with a concrete type for each use of the code.

But what are the claimed differences between the two? Well, I’m going to demonstrate by an example. Lets say we have an abstract type, IPet, which has two implementations, Cat and Mouse, each of which implements the IPet’s Stroke function (I’m using C# for the examples):

Now lets say we want a function to repeatedly stroke a pet a given number of times, but abstracted from what type of pet it actually is. We can implement this using subtype polymorphism like this:

And here is an implementation using parametric polymorphism:

So what’s the difference? Well in terms of the implementation of StrokePet1 and StrokePet2 I see no difference. Both function implementations read like this:

Okay, well, in C# there is a nontrivial difference. Particularly in the way you call the function. With parametric polymorphism, you need to know the type T upfront when you call StrokePet2 2. The same thing applies generally. For example, when you declare a List<T>T needs to be known upfront when the declaration is made (or it needs to be a type parameter in itself)

There is another difference in C#: the parametric type T can be used in multiple locations to ensure that each declaration has the same type. Consider we have a function that evaluates two pets of the same species, and returns the “winner” (by some unknown criteria):

And now lets try implement this using subtype polymorphism:

Oh dear.. there’s no way to specify that left is the same pet type as right or the return type. This is the same problem we would have with using non-generic container classes: there’s no way to specify at compile time that some abstract queue should only contain elements of a certain type:

So wait, am I saying they’re actually different? No. I’m saying that in C#, subtyping polymorphism is just parametric polymorphism but with less static information.

I don’t think this is a shortcoming of subtype polymorphism – it’s a shortcoming of the language to not be allow us to fully express the relationships between between types used in a class or function. There’s no way to specify that a C# Queue container has to be homogeneous without using generics, but that just means that C# subtyping polymorphism isn’t as expressive as its parametric polymorphism. Conversely, parametric polymorphism in C# is more strict because we need to supply more static information. Consider the following:

So really, it’s all about the balance between information we keep in our heads vs information we give to the compiler for static type checking.

So What?

What’s the point? Does anyone care that there may be some airy-fairy overarching concept that ties parametric and subtype polymorphism? I would say yes, and the reason is about code re-usability and performance.

Regarding performance, in C# there is a difference in the performance of using each type of polymorphism. In the case of parametric polymorphism, the underlying code can be specialized for the provided type argument, making it more efficient. Lets say we have an INumber interface, (like Java’s Number class). Should we really have to decide between the following two implementations:

This seems like the sort of thing the compiler should sort out, akin to inline optimization there should be specialization optimization. Inlining of functions involves taking their parameters and substituting concrete arguments to specialize it for a specific call site – sound familiar?

Regarding reusability, note that functions with generic parameters are not interchangeable with functions that have virtual parameters, and so to some extent the designer of a function or class needs to think ahead about how much of it to make generic. But why? Why can’t the language make generics seamless: the coder should specify exactly the required details for the code, and no more. Everything else should just be “generic” by default. Need a number variable…? Just use INumber, not int or short. The compiler can “inline” (specialize) specific types as they become known. Whether the compiler generates an actual abstract object with dynamic dispatching at runtime, or just specializes the code to the concrete type, should just an optimization issue and not the concern of the programmer.

As a side note, there are languages which make generics much less invasive. For example, F# is a language where I think it really is feasible to make all code generic (except at the root). But I think we need this capability in more popular OOP languages like C# and Java.

In conclusion, I think that parametric and subtype polymorphism are actually the same thing in a statically typed language. In terms of program operation, subtype polymorphism is just like a parametric polymorphism that’s missing a few static features. Popular modern languages should use this make it easier to write reusable, high-performance code.

  1. The exact details actually depend on the language, so this is a bit of an over simplification 

  2. Of course if you don’t know the derived type upfront C# will happily substitute the base type into the generic, in which case you’re then using subtype polymorphism instead