When defining a generic interface, have you received a hint from ReSharper like “The type parameter T could be declared as covariant” (or “contravariant”)? If so, have you then blindly applied the proposed refactoring, which decorates your generic parameter with the `in` or `out` keyword? Like so:

    public interface ISomeGenericInterface<in T>
I know I’ve done this a few times before deciding to dig deeper into what these terms actually mean and how they affect my type’s behavior. Type variance is one of those topics developers work with pretty much on a daily basis but don’t always understand well.
I will discuss what exactly it means for a type to be covariant, contravariant or invariant, how those terms are represented via the `in` and `out` keywords, the semantics of the so-called input and output positions, and how all of this can make our types more flexible and useful.
When I was researching the topic myself, I came across some online resources that get a little bit theoretical and academic. That’s probably because the terms covariance and contravariance have deep mathematical origins. This is not something I will bother you with here. In this blog post, I’ll try to develop the practical intuition behind the type variance concept with a lot of examples. I hope this will lead to some logical conclusions that make the formal definitions quite intuitive.
Throughout this article I will work with a very simple class hierarchy:
    public class Person
    {
        public Person(string name)
        {
            Name = name;
        }

        public string Name { get; }
    }

    public class Student : Person
    {
        public Student(string name) : base(name) { }

        // Student specific fields...
    }

    public class Teacher : Person
    {
        public Teacher(string name) : base(name) { }

        // Teacher specific fields...
    }
Not surprisingly, `Person` is the base class with a `Name` property, and `Student` and `Teacher` derive from it. The discussion in the following sections will refer to those classes for concreteness, but all the statements made are generally applicable.
Let’s start with a warm-up exercise.
Since `Student` and `Teacher` inherit from `Person`, you can substitute a `Teacher` or a `Student` everywhere a `Person` is expected. For example, if a method has an input parameter of type `Person`, you can always pass an instance of `Student` or `Teacher`. Also, if a method is declared to return a `Person`, the implementation can actually return a `Student` or a `Teacher`. We’re familiar with OOP principles and the rules of inheritance and polymorphism, so this behavior is very natural to us.
Now the question is: what if the method accepts as a parameter not a `Person`, but a generic type `G<Person>`? Does that mean we can pass `G<Teacher>` or `G<Student>`?
This question will be directly related to our type variance discussion. By the end of this article you should be able to give an answer and talk confidently on this topic.
Arrays Covariance
Let’s move on with a simple example to start developing our intuition.
Have a look at the following method:
    public void PrintNames(Person[] people)
    {
        foreach (var person in people)
        {
            Console.WriteLine(person.Name);
        }
    }
It receives an array of `Person` objects as input and writes all their names to the console. But what if, instead of an array of `Person`, we pass an array of `Teacher` or `Student`? Like so:
    Student[] students =
    {
        new Student("John"),
        new Student("Peter")
    };
    PrintNames(students);
This code compiles, works and makes perfect sense. Both `Student` and `Teacher` have names, so being able to print them by passing `Student[]` or `Teacher[]` instead of `Person[]` makes our method way more useful than if we had to provide an exact type match of `Person[]`.
With this example in mind you may start thinking that anywhere we expect a generic type of Person we should be able to pass an object with the generic parameter being Teacher or Student. Not really. Let’s look at a different example:
    public void Update(Person[] people)
    {
        people[0] = new Teacher("Paul");
    }
What if we call this method like this?
    Student[] students =
    {
        new Student("John"),
        new Student("Peter")
    };
    Update(students);
    Student firstStudent = students[0];
This code compiles but will throw an exception at runtime. To illustrate the problem in another way, we can simplify the example:
    Student[] students =
    {
        new Student("John"),
        new Student("Peter")
    };
    Person[] people = students;
    people[0] = new Teacher("Paul");
    Student student = students[0];
Follow the assignments carefully. The original array is of type `Student[]`. However, because we can substitute `Person[]` with `Student[]`, the assignment `Person[] people = students;` gives us a reference to an array of `Person` objects. Every `Teacher` is a `Person`, so the statement `people[0] = new Teacher("Paul");` is allowed by the compiler. However, this is the place where we get the exception (an `ArrayTypeMismatchException`) at runtime:
Attempted to access an element as a type incompatible with the array.
And of course, this is something we would expect. If this code had worked, what would the final statement, `Student student = students[0];`, do? We would be trying to assign a `Teacher` to a `Student`. These two types are not compatible, so that wouldn’t make sense, to say the least. Jon Skeet gives an identical example on Stack Overflow.
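To make the runtime check tangible, here is a minimal sketch that wraps the problematic write in a `try`/`catch`. The `ArrayCovarianceDemo` class and its `StoreIsRejected` method are hypothetical names introduced only for this illustration, and the small class hierarchy is repeated so the snippet compiles on its own.

```csharp
using System;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }
public class Teacher : Person { public Teacher(string name) : base(name) { } }

// Hypothetical helper, not part of the original article.
public static class ArrayCovarianceDemo
{
    // Returns true when the runtime refuses to store a Teacher in the array.
    public static bool StoreIsRejected(Person[] people)
    {
        try
        {
            // The compiler accepts this for any Person[] reference.
            people[0] = new Teacher("Paul");
            return false; // the underlying array can hold a Teacher
        }
        catch (ArrayTypeMismatchException)
        {
            // The underlying array was something narrower, e.g. Student[].
            return true;
        }
    }
}
```

Called with a `Student[]`, the method reports a rejected write and the array keeps its original elements. Called with a real `Person[]` or `Teacher[]`, the write goes through.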
Being able to pass an array of `Student` or `Teacher` where an array of `Person` is expected means that arrays are covariant. That was a deliberate decision from the beginning, when the language did not support generic types. Although there may be some dangerous consequences, as we’ve seen, this behavior brings a lot of flexibility for implementing general-purpose algorithms like our `PrintNames` method. Some more examples are described here.
Prior to C# 4.0, all generic types were invariant. Back then we could only use an exact type match when working with generic interfaces. That was a rather drastic measure the language designers took to deal with potential problems like the one we’ve just seen with arrays. Luckily, this changed with the realization that in some cases we can relax the exact match restriction and make our types more flexible. Let’s start exploring what those cases are from the covariance perspective.
Covariance
After the arrays example, we can now give a more formal definition for generic types covariance:
A generic type G<T> is covariant with respect to T if G<X> is convertible to G<Y> given that X is convertible to Y
We’ve seen there are cases when it makes perfect sense for a type to be covariant, which can make some algorithms more general. But we’ve also seen a counterexample where array covariance led to an exception at runtime. Let’s take a closer look at the `PrintNames` and `Update` methods. Here are their definitions once again:
    public void PrintNames(Person[] people)
    {
        foreach (var person in people)
        {
            Console.WriteLine(person.Name);
        }
    }

    public void Update(Person[] people)
    {
        people[0] = new Teacher("Paul");
    }
Clearly, the `PrintNames` method only reads the array elements. `Update`, on the other hand, modifies the array, and this is exactly what leads to the problematic behavior. With arrays being covariant, there is no way to know the exact type of the input parameter at compile time: it can be `Person[]`, `Student[]` or `Teacher[]`. So we can’t safely do updates. In our case, if the exact type of the input happens to be `Teacher[]` or `Person[]`, the code will work. If it’s `Student[]`, though, it will throw an exception, as we’ve already seen. This behavior is very fragile and a clear example of a violation of the Liskov Substitution Principle. That’s why the C# designers initially decided to forbid such behavior when they introduced generics.
But what about the cases when covariance is useful? If our generic interface contains only read operations with respect to the generic parameter, we are in a safe place. Thus, starting with C# 4, the restrictions were relaxed by introducing the `in` and `out` modifiers for generic type parameters, which annotate a type parameter as contravariant or covariant respectively. Contravariant types will be discussed later in this article. Let’s now build some intuition about what it means for a generic interface to treat its generic parameter as read-only.
Imagine the `T` parameter only appears as a method return type. Like so:

    public interface IMyReadOnlyCollection<T>
    {
        T GetElementAt(int index);
    }

That’s a perfect example of a type that can be made covariant. This is simply achieved by adding the `out` keyword before the `T` parameter:

    public interface IMyReadOnlyCollection<out T>

The generic type parameter `T` is said to be at an output position in this interface. Output positions are limited to function return values, property get accessors, and certain delegate positions.
After our change, we can now safely substitute `IMyReadOnlyCollection<Person>` with `IMyReadOnlyCollection<Student>` or `IMyReadOnlyCollection<Teacher>`.
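As a rough sketch of what that substitution looks like in practice, the snippet below passes a collection of students to a method that expects `IMyReadOnlyCollection<Person>`. The `ArrayBackedCollection<T>` class and the `CovarianceDemo.NameAt` helper are hypothetical names made up for this illustration, and the class hierarchy is repeated so the snippet stands on its own.

```csharp
using System;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }

public interface IMyReadOnlyCollection<out T>
{
    T GetElementAt(int index);
}

// Hypothetical array-backed implementation, for illustration only.
public class ArrayBackedCollection<T> : IMyReadOnlyCollection<T>
{
    private readonly T[] _items;

    public ArrayBackedCollection(params T[] items) { _items = items; }

    public T GetElementAt(int index) => _items[index];
}

public static class CovarianceDemo
{
    // Only reads from the collection, so any collection of Person subtypes is safe here.
    public static string NameAt(IMyReadOnlyCollection<Person> people, int index)
        => people.GetElementAt(index).Name;
}
```

With the `out` modifier in place, `CovarianceDemo.NameAt(new ArrayBackedCollection<Student>(new Student("John")), 0)` compiles. Without it, the same call is rejected because the type arguments do not match exactly.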
If you think for a moment, you’ll probably come up with some “read-only” generic interfaces in .NET. You wouldn’t be surprised that `IEnumerator<T>` is one of them:

    public interface IEnumerator<out T> : IEnumerator, IDisposable
    {
        T Current { get; }
    }

Logically, it’s marked as covariant.
No surprise that `IEnumerable<T>`, which returns an instance of `IEnumerator<T>`, is also covariant:

    public interface IEnumerable<out T> : IEnumerable
    {
        IEnumerator<T> GetEnumerator();
    }
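A small sketch of the practical payoff: because `IEnumerable<T>` is covariant, a `List<Student>` can be handed to any method that takes `IEnumerable<Person>`. The `EnumerableCovarianceDemo` class and `CollectNames` method are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }

public static class EnumerableCovarianceDemo
{
    // Accepts any sequence whose elements are Person or a subtype of Person.
    public static List<string> CollectNames(IEnumerable<Person> people)
        => people.Select(p => p.Name).ToList();
}
```

This is the generic counterpart of the earlier `PrintNames(Person[] people)` method, with the difference that there is no way to write into the sequence, so no runtime type check is needed.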
Thus far, we’ve spent quite a bit of time on covariance. We haven’t touched contravariance yet so this will be the next topic.
Contravariance
I’ve seen some places that explain contravariance as “the opposite of covariance”. I guess that’s a valid statement and makes a lot of sense, but it surely wasn’t something that got me closer to understanding the concept in the beginning. So, following the approach in this article, I’ll try to explain the meaning and applicability of contravariance with a (hopefully) useful example.
Say we have a simple interface for objects comparison:
    public interface IMyComparer<T>
    {
        int Compare(T x, T y);
    }

And a concrete implementation for comparing objects of type `Person`:

    public class PersonComparer : IMyComparer<Person>
    {
        public int Compare(Person x, Person y)
        {
            return string.CompareOrdinal(x.Name, y.Name);
        }
    }
The implementation here just does a lexicographical comparison of the names. However, the exact comparison algorithm is not really relevant to our discussion so don’t waste any energy on that.
Now, consider this method:
    public int Compare(IMyComparer<Student> comparer)
    {
        var s1 = new Student("John");
        var s2 = new Student("Peter");
        return comparer.Compare(s1, s2);
    }

This method creates two students and compares them via the received comparer of type `IMyComparer<Student>`.
Let me ask you a question. What if we want to compare `Student` objects in the same way we compare `Person` objects? Concretely, by comparing their names. We already have an implementation that compares `Person` objects by `Name`: the `PersonComparer` class. Why not just use it for our student comparison, like so:

    var personComparer = new PersonComparer();
    var comparisonResult = Compare(personComparer);
    Console.WriteLine(comparisonResult);
This code would not compile, and prior to C# 4 there was nothing we could do about it. The reason is clear: we are passing an object implementing `IMyComparer<Person>` but the method expects `IMyComparer<Student>`. The types just don’t match. This is unfortunate, though. Logically, if we have a way to compare two persons, we should be able to use that to compare students, because students are actually persons with some additional characteristics. If we say that two persons with the same name are equal, and we want to use that comparison logic for students, we should be able to do that.
This is where contravariance comes into play. In a very similar way to what we had with covariance, we need to annotate the generic type parameter in the interface. This time, instead of `out`, we’re going to use the `in` keyword. Our modified interface definition now looks like this:

    public interface IMyComparer<in T>
    {
        int Compare(T x, T y);
    }

In contrast with output positions, input positions are limited to method input parameters and some locations in delegate parameters.
Now our `IMyComparer` interface is declared as contravariant. By doing that, we can pass the `PersonComparer` to the `Compare` method that expects a `Student` comparer. The compiler is happy and we have achieved our goal of reusing our person comparer when comparing students!
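The same idea shows up in the framework itself: the built-in `System.Collections.Generic.IComparer<in T>` is declared contravariant, so a comparer written for `Person` can sort a list of students. The sketch below assumes a hypothetical `PersonNameComparer` class and a `ContravarianceDemo.SortByName` helper, neither of which comes from the original article.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }

// Hypothetical comparer written once, against the base type.
public class PersonNameComparer : IComparer<Person>
{
    public int Compare(Person x, Person y) => string.CompareOrdinal(x.Name, y.Name);
}

public static class ContravarianceDemo
{
    public static void SortByName(List<Student> students)
    {
        // IComparer<Person> converts to IComparer<Student> because T is marked 'in'.
        IComparer<Student> comparer = new PersonNameComparer();
        students.Sort(comparer);
    }
}
```

The explicit `IComparer<Student>` local is there only to make the conversion visible. Passing `new PersonNameComparer()` straight to `Sort` works as well.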
For completeness, let’s give a formal definition for contravariance in a similar way we did for covariance:
A generic type G<T> is contravariant with respect to T if G<X> is convertible to G<Y> given that Y is convertible to X
If this still doesn’t sound very intuitive, just try substituting X and Y with Person and Student. The definition would become:
A generic type G<T> is contravariant with respect to T if G<Person> is convertible to G<Student> given that Student is convertible to Person
I hope this now makes sense in the context of the comparer example we’ve been looking at.
Invariance
There are occasions when we have to use an exact type match. These are the cases when the generic parameter is both at input and output positions in the interface.
Previously we’ve seen that `IEnumerable<T>` is covariant because its operations are read-only with respect to the `T` parameter. This is not the case with some other container-based interfaces, though. Let’s take `IList<T>` as an example:

    public interface IList<T> : ICollection<T>, IEnumerable<T>, IEnumerable
    {
        T this[int index] { get; set; }
        int IndexOf(T item);
        void Insert(int index, T item);
        void RemoveAt(int index);
    }

We can’t declare this interface as covariant or contravariant because the `T` parameter acts as both input and output. If we could treat lists covariantly, we would get into exactly the same situation as we did with arrays:

    var students = new List<Student>();
    IList<Person> people = students;
    people.Add(new Teacher("Peter"));
    Student student = students[0];

This code doesn’t compile, and we can’t get away with anything but an exact type match.
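When a method only needs to read from a list, one way around the invariance of `IList<T>` is to ask for a read-only view instead. The sketch below relies on `IReadOnlyList<out T>`, which `List<T>` implements and which is declared covariant. The `InvarianceDemo` class and its methods are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }

public static class InvarianceDemo
{
    // Read-only access by index: T appears only at output positions.
    public static string FirstName(IReadOnlyList<Person> people) => people[0].Name;

    public static string Run()
    {
        var students = new List<Student> { new Student("John") };

        // IList<Person> people = students;   // does not compile: IList<T> is invariant

        return FirstName(students);           // compiles: IReadOnlyList<out T> is covariant
    }
}
```

The design choice mirrors the interface split itself: the writable interface has to stay invariant, while the read-only one can safely be covariant.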
Action and Func Delegates
So far, we’ve only investigated generic interfaces, but delegates can also have generic parameters so it’s worth spending some time exploring them too in the context of our type variance discussion.
Recall the signatures of the `Action` and `Func` delegates:

    public delegate void Action();
    public delegate void Action<in T>(T obj);
    public delegate void Action<in T1, in T2>(T1 arg1, T2 arg2);
    // more Action delegates

    public delegate TResult Func<out TResult>();
    public delegate TResult Func<in T, out TResult>(T arg);
    public delegate TResult Func<in T1, in T2, out TResult>(T1 arg1, T2 arg2);
    // more Func delegates

Notice the usage of the `in` and `out` keywords. I am quite sure that after our discussion so far this makes good sense to you: the return type is declared covariant, and the input parameters are declared contravariant.
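Here is a brief sketch of what those annotations allow when delegates are assigned to one another. `DelegateVarianceDemo` and its `Run` method are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }

public static class DelegateVarianceDemo
{
    public static string Run()
    {
        var log = new List<string>();

        // Contravariance: something that can handle any Person can handle a Student.
        Action<Person> recordPerson = p => log.Add(p.Name);
        Action<Student> recordStudent = recordPerson;
        recordStudent(new Student("John"));

        // Covariance: something that produces a Student produces a Person.
        Func<Student> makeStudent = () => new Student("Peter");
        Func<Person> makePerson = makeStudent;
        log.Add(makePerson().Name);

        return string.Join(",", log);
    }
}
```

Both assignments go in the direction the `in` and `out` modifiers permit. Reversing either one (for example assigning an `Action<Student>` to an `Action<Person>`) is rejected by the compiler.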
However, when these delegates are used in generic interfaces, it can get a little confusing. I will go through some specific examples you may find puzzling.
Consider the following interface:
    public interface ICovariantInterface<out T>
    {
        T GetAnItem();
        Func<T> GetAnItemWhenNeeded();
        void GetAnItemAndPassItToMe(Action<T> callback);
    }

You can see the `T` parameter is covariant. This makes perfect sense for the `GetAnItem()` method as it returns a `T`.
How about `GetAnItemWhenNeeded()`? This method returns a function that the client can call later to retrieve an item `T`. So `T` is still being returned from the method, but this time lazily, by calling the returned function. I guess this also makes sense.
The third method in the interface, `GetAnItemAndPassItToMe`, is probably the most confusing one. The `Action<T>` callback is passed as a method argument, but the type parameter `T` itself is at an output position. Why is that? Well, the logic is very similar to the `Func` case. Although the `Action` is passed as an input, the method uses it to hand a `T` back to the caller by passing that `T` to the `Action` the caller provided. Semantically, this still means that `T` is being returned from this method, just not directly but through a callback instead.
I guess this might have been a little confusing. It’s perfectly fine if you need to go over it once again. Delegates tend to visually “flip” the in/out positions of the type parameter, but if you take the time to process it, you’ll see that from a logical standpoint `T` is still being returned from all the methods in this interface, directly or indirectly.
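To see all three methods hand a `T` back to the caller, here is a minimal sketch of an implementation and a client. `SingleItemSource<T>` and `CovariantInterfaceDemo` are hypothetical names introduced for this illustration; only the interface itself comes from the article.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }

public interface ICovariantInterface<out T>
{
    T GetAnItem();
    Func<T> GetAnItemWhenNeeded();
    void GetAnItemAndPassItToMe(Action<T> callback);
}

// Hypothetical implementation that always hands out the same item.
public class SingleItemSource<T> : ICovariantInterface<T>
{
    private readonly T _item;

    public SingleItemSource(T item) { _item = item; }

    public T GetAnItem() => _item;                                    // returned directly
    public Func<T> GetAnItemWhenNeeded() => () => _item;              // returned lazily
    public void GetAnItemAndPassItToMe(Action<T> callback) => callback(_item); // returned via callback
}

public static class CovariantInterfaceDemo
{
    public static List<string> Run()
    {
        // A source of students is used as a source of persons.
        ICovariantInterface<Person> source = new SingleItemSource<Student>(new Student("John"));

        var names = new List<string>();
        names.Add(source.GetAnItem().Name);
        names.Add(source.GetAnItemWhenNeeded()().Name);
        source.GetAnItemAndPassItToMe(p => names.Add(p.Name));
        return names;
    }
}
```

In every call the client ends up holding a `Person` that the source produced, which is why `T` counts as an output in all three signatures.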
Sure enough, we can build a similar example for contravariance:
    public interface IContravariantInterface<in T>
    {
        void ActOnAnItem(T item);
        void ActOnAnItemWhenNeeded(Func<T> item);
        Action<T> ActOnAnItemAndPassItToMe();
    }

The `T` parameter is contravariant. The intuition for why this is the case is the same as in the covariance example, but the logic runs in the opposite direction. So I’ll leave this for you to figure out.
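If you would like something to check your reasoning against, the following is one possible sketch. `NameRecorder` and `ContravariantInterfaceDemo` are hypothetical names made up for this example. In each method the implementation consumes a `T` that the caller supplies, whether directly, through a `Func<T>` it invokes, or through an `Action<T>` it hands out.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

public class Student : Person { public Student(string name) : base(name) { } }

public interface IContravariantInterface<in T>
{
    void ActOnAnItem(T item);
    void ActOnAnItemWhenNeeded(Func<T> item);
    Action<T> ActOnAnItemAndPassItToMe();
}

// Hypothetical consumer written against the base type.
public class NameRecorder : IContravariantInterface<Person>
{
    public List<string> Names { get; } = new List<string>();

    public void ActOnAnItem(Person item) => Names.Add(item.Name);                   // consumed directly
    public void ActOnAnItemWhenNeeded(Func<Person> item) => Names.Add(item().Name); // consumed lazily
    public Action<Person> ActOnAnItemAndPassItToMe() => p => Names.Add(p.Name);     // consumed via returned action
}

public static class ContravariantInterfaceDemo
{
    public static List<string> Run()
    {
        var recorder = new NameRecorder();

        // A consumer of persons is used as a consumer of students.
        IContravariantInterface<Student> studentSink = recorder;

        studentSink.ActOnAnItem(new Student("John"));
        studentSink.ActOnAnItemWhenNeeded(() => new Student("Peter"));
        studentSink.ActOnAnItemAndPassItToMe()(new Student("Mary"));
        return recorder.Names;
    }
}
```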
Summary
Type variance can be a confusing topic at first. I spent some time experimenting with different methods and interfaces until I managed to wrap my head around it. I’ve tried to explain the matter very practically with a lot of examples, hoping that will work for you!
Thanks for reading.
Resources
- Effective C#, Bill Wagner
- http://tomasp.net/blog/variance-explained.aspx/
- https://stackoverflow.com/questions/18666710/why-are-arrays-covariant-but-generics-are-invariant
- https://stackoverflow.com/a/2745301/3270139
- https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/covariance-contravariance/
- https://stackoverflow.com/questions/2184551/difference-between-covariance-contra-variance
- https://en.wikipedia.org/wiki/Liskov_substitution_principle
Hi, I like this article. I was struggling to understand co- and contravariance, but this explains it in practical manner, which is more suited for me.
I have questions for the most confusing parts in delegates 🙂
In the covariance interface ICovariantInterface there is the method
void GetAnItemAndPassItToMe(Action<T> callback);
Shouldn’t this be
void GetAnItemAndPassItToMe(Func<T> callback); ?
Because Func returns the type T, so T is readonly and can be an out type. Action expects T as an input, and I guess it can potentially modify it, so it is not readonly, so it cannot be covariant.
With the same reasoning this looks confusing too in IContravariantInterface:
void ActOnAnItemWhenNeeded(Func<T> item); -> void ActOnAnItemWhenNeeded(Action<T> item);
But maybe I just misunderstand it.
Hey!
Glad to hear you found this article useful.
The confusion that you have is something that everyone is getting through until it “clicks.”
Let me give an example.
Say you have the following function:

    Person GetPerson()

As a client of this function, you may want to get the Person and print its name like so:

    var person = GetPerson();
    Console.WriteLine(person.Name);

I hope we agree that the Person here is clearly at an output position as it’s returned from the function.
Let’s now see another example:

    void GetPerson(Action<Person> callback)

As a client of this method, you can accomplish absolutely the same thing as in our first example. The only difference is that the function doesn’t return a person directly but through a delegate. Let’s see that in action:

    Action<Person> callback = (Person p) => Console.WriteLine(p.Name);
    GetPerson(callback);
Do you see how in both of the cases, the GetPerson method is returning a person to the client? In the first case, this is directly through the return type of the function, while in the second case, it’s by passing a person to a delegate provided by the client. Logically, both of the functions return a person and that’s why we consider Person to be at an output position in both of the cases.
I hope that made the case clearer. Let me know if I can help you further.
Thank you, that did help me to understand it. I played around with it in VS and it’s a bit weird but makes sense now.
I was confused that in void GetAnItemAndPassItToMe(Action<T> callback); T is both an input and output type, and does that mean that it can be only an Action if used within a function with ICovariantInterface, but I figured out that no.
So thanks I kinda get it now:)