Aren’t Subtype and Parametric Polymorphism the Same?
There are apparently two common types of polymorphism in software: subtype polymorphism and parametric polymorphism. I think that these are really the same thing, and I can’t any good reason why modern languages separate the two. I’m going to explain my reasoning in this post.
Firstly, what is are subtype polymorphism and parametric polymorphism? Briefly, subtype polymorphism is when you use class or interface inheritance write more generic code1. The generic code works with the interface or base class, and therefore implicitly works for all subtypes or interface implementations. Parametric polymorphism is when you code using generics or templates – the code has a type parameter which can be substituted with a concrete type for each use of the code.
But what are the claimed differences between the two? Well, I’m going to demonstrate by an example. Lets say we have an abstract type, IPet
, which has two implementations, Cat
and Mouse
, each of which implements the IPet’s Stroke
function (I’m using C# for the examples):
interface IPet { void Stroke(); } class Cat : IPet { void Stroke() { Console.WriteLine("Purr"); } } class Mouse : IPet { void Stroke() { Console.WriteLine("Squeak"); } }
Now lets say we want a function to repeatedly stroke a pet a given number of times, but abstracted from what type of pet it actually is. We can implement this using subtype polymorphism like this:
void StrokePet1(IPet pet, int count) { for (int i = 0; i < count; i++) pet.stroke(); }
And here is an implementation using parametric polymorphism:
void StrokePet2<T>(T pet, int count) where T: IPet { for (int i = 0; i < count; i++) pet.stroke(); }
So what’s the difference? Well in terms of the implementation of StrokePet1
and StrokePet2
I see no difference. Both function implementations read like this:
Given a pet of any type: Loop count times: Stroke the pet
Okay, well, in C# there is a nontrivial difference. Particularly in the way you call the function. With parametric polymorphism, you need to know the type T
upfront when you call StrokePet2
2. The same thing applies generally. For example, when you declare a List<T>
, T
needs to be known upfront when the declaration is made (or it needs to be a type parameter in itself)
There is another difference in C#: the parametric type T
can be used in multiple locations to ensure that each declaration has the same type. Consider we have a function that evaluates two pets of the same species, and returns the “winner” (by some unknown criteria):
T ChooseWinner<T>(T left, T right) where T: IPet;
And now lets try implement this using subtype polymorphism:
IPet ChooseWinner(IPet left, IPet right);
Oh dear.. there’s no way to specify that left
is the same pet type as right
or the return type. This is the same problem we would have with using non-generic container classes: there’s no way to specify at compile time that some abstract queue should only contain elements of a certain type:
// This queue is supposed to only contain IPets, but how do we specify it? Queue myQueue = new Queue(); myQueue.Enqueue(new Mouse()); myQueue.Enqueue(new Cat()); myQueue.Enqueue(new object); // Ooops. Why can't the compiler catch this? IPet first = myQueue.Dequeue(); // why won't it allow this?
So wait, am I saying they’re actually different? No. I’m saying that in C#, subtyping polymorphism is just parametric polymorphism but with less static information.
I don’t think this is a shortcoming of subtype polymorphism – it’s a shortcoming of the language to not be allow us to fully express the relationships between between types used in a class or function. There’s no way to specify that a C# Queue
container has to be homogeneous without using generics, but that just means that C# subtyping polymorphism isn’t as expressive as its parametric polymorphism. Conversely, parametric polymorphism in C# is more strict because we need to supply more static information. Consider the following:
Queue myQueue = new Queue(); // Here myPet is a mouse, but we haven't told the compiler it must be object myPet = new Mouse(); myQueue.Enqueue(myPet); // Why won't it allow this?
So really, it’s all about the balance between information we keep in our heads vs information we give to the compiler for static type checking.
So What?
What’s the point? Does anyone care that there may be some airy-fairy overarching concept that ties parametric and subtype polymorphism? I would say yes, and the reason is about code re-usability and performance.
Regarding performance, in C# there is a difference in the performance of using each type of polymorphism. In the case of parametric polymorphism, the underlying code can be specialized for the provided type argument, making it more efficient. Lets say we have an INumber
interface, (like Java’s Number
class). Should we really have to decide between the following two implementations:
INumber Add(INumber first, INumber second) { return first + second; } TResult Add<TFirst,TSecond,TResult>(TFirst first, TSecond second) where TFirst: INumber where TSecond: INumber where TResult: INumber { return first + second; }
This seems like the sort of thing the compiler should sort out, akin to inline optimization there should be specialization optimization. Inlining of functions involves taking their parameters and substituting concrete arguments to specialize it for a specific call site – sound familiar?
Regarding reusability, note that functions with generic parameters are not interchangeable with functions that have virtual parameters, and so to some extent the designer of a function or class needs to think ahead about how much of it to make generic. But why? Why can’t the language make generics seamless: the coder should specify exactly the required details for the code, and no more. Everything else should just be “generic” by default. Need a number variable…? Just use INumber
, not int
or short
. The compiler can “inline” (specialize) specific types as they become known. Whether the compiler generates an actual abstract object with dynamic dispatching at runtime, or just specializes the code to the concrete type, should just an optimization issue and not the concern of the programmer.
As a side note, there are languages which make generics much less invasive. For example, F# is a language where I think it really is feasible to make all code generic (except at the root). But I think we need this capability in more popular OOP languages like C# and Java.
In conclusion, I think that parametric and subtype polymorphism are actually the same thing in a statically typed language. In terms of program operation, subtype polymorphism is just like a parametric polymorphism that’s missing a few static features. Popular modern languages should use this make it easier to write reusable, high-performance code.