As you might be aware, I'm a huge .NET fanboy, and so today we're not going to talk about PowerShell, but about a core feature of my first love in that ecosystem - C# and it's awesome support for generic types!
C#'s Quirky Syntax(es)
I've recently encountered multiple individuals who indepently of each other were discovering and experiencing the magic of LINQ and marvelling at its...
I don't know why the world needed another syntax for querying... pic.twitter.com/iXsVSLjwel
— Stephen 🦊 Owen (@FoxDeploy) July 24, 2018
... ok, maybe it's not initially the most popular query syntax ever for everyone, but I digress.
LINQ - Language Integrated Query - is a combination of language features in C# (3.0) and a set of accompanying .NET extension methods introduced in .NET Framework 3.5, with the C# part covering the query syntax itself.
What does it look like in practice? Let's take a look at the following piece of C#:
int[] numbers = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var everyThird = from number in numbers
where number % 3 == 0
orderby number descending
select number;
If you're familiar with other query languages like SQL or WQL, this should look oddly familiar, you might even be able to guess exactly what the result is - an array like this: [9,6,3]
.
In the example above, we query an in-memory data source - the numbers
array - and an implicit provider (the LINQ to Objects provider) then translates it into a series of method calls that the C# compiler can turn into IL (think of this provider as a compiler plugin that aids the parser), the Intermediary Language instructions that the .NET runtime can execute.
The query above roughly translates to the following chain of C# method calls:
int[] numbers = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var everyThird = numbers.Where(number => number % 3 == 0)
.OrderByDescending(number => number)
.Select(number => number);
Not only is this functionally equivalent (it still outputs an array [9,6,3]
) to the query syntax example, it compiles to the exact same IL.
This is the resulting IL from compiling the first and the second example, respectively, in the Main
method of an application:
The only difference in the two resulting executables is that the LINQ provider optimizes away the last .Select(number => number)
statement (since it's a functional no-op anyway), in the query syntax version. Notice how the offset of the one compiled from the second example is slightly further along - that's to make room for the Select(number => number)
method call.
What does this have to do with generics?
In order to allow the from
keyword to query any kind of thing (in SQL this would be a row, in WQL an instance) from any kind of data store (in SQL this would be a table, in WQL a class in a repository namespace), there's only one requirement:
- The target must implement either
IQueryable<T>
orIEnumerable<T>
.
But what is T
? These, ladies and gentlemen, are generic interfaces. That is, T
is not actually a compile-time defined type, it's simply a placeholder for a type parameter that will be specified later.
It turns out that plenty of data structures contain or reference other objects, the type of which are simply not important. Think of an array for example - whether it's an int[]
or a string[]
is not that important - the array will still have a number of behaviors defined (indexing, counting the length, linear search etc.) completely orthogonal to the type of data it references, and therefore these behaviors can be defined without knowing the type.
In short...
generics allows us to define and compile structured and complex data types and methods that can be reused for an arbitrary number of simpler data types
The most common examples are collection types (List<T>
, Stack<T>
, Queue<T>
and so on), but as mentioned above this is not a hard requirement.
Enough with the mumbo-jumbo, get to the point!
Alright, let's take an example of why this may be useful, by implementing a method that rotates arrays to the left! Array rotation is a somewhat simple operation, a sort of "circular shift" if you will:
- Imagine a integer array like this:
[1,2,3,4]
- A left-rotation of length 1 would result in
[2,3,4,1]
- A right-rotation of length 1 would result in
[4,1,2,3]
For an int[]
in C#
, we might do something like this:
static void RotateLeft(int[] array, int count)
{
int[] buffer = new int[count];
Array.Copy(array, buffer, count);
Array.Copy(array, count, array, 0, array.Length - count);
Array.Copy(buffer, 0, array, array.Length - count, count);
}
The problem with this is that it only works on int[]
- if I include this in a library or as part of a bigger solution, someone else might come along with a use case for a long[]
, a string[]
or a CoolBusinessLogicObjectIHaventThoughOf[]
, and now they'll have to copy-paste and refactor this method, even though the steps for rotating the array remain exactly the same.
I may end up with something like:
static void RotateLeft(int[] array, int count) { /* ... */ }
static void RotateLeft(string[] array, int count) { /* ... */ }
static void RotateLeft(byte[] array, int count) { /* ... */ }
static void RotateLeft(SomeClass[] array, int count) { /* ... */ }
static void RotateLeft(AStruct[] array, int count) { /* ... */ }
... which is a maintenance nightmare - especially when I tell you that we forgot to add bounds checking for count
in the original method before copy-pasting all that code and now we'll have to manually rewrite all of them...
RotateLeft<T>()
to the rescue!
Wouldn't it be nice if we could describe the algorithm for rotating the array once, and then have it be reusable for any type of array? Fortunately this is exactly what generics excel at, so let's try and rewrite RotateLeft()
as a generic method. How, you ask? It's so ridiculously easy you won't believe but it literally only takes 2 changes:
- Add a type parameter placeholder to the method signature
- Replace the target data type with the placeholder from step 1
So, we end up with something like this:
static void RotateLeft<T>(T[] array, int count)
{
T[] buffer = new T[count];
Array.Copy(array, buffer, count);
Array.Copy(array, count, array, 0, array.Length - count);
Array.Copy(buffer, 0, array, array.Length - count, count);
}
That's it - and now we never have to copy paste this method again. This brings down the maintenance cost, since we can now fix the missing bounds check from before just once, and it's fixed for all types:
static void RotateLeft<T>(T[] array, int count)
{
count = array.Length % count;
T[] buffer = new T[count];
Array.Copy(array, buffer, count);
Array.Copy(array, count, array, 0, array.Length - count);
Array.Copy(buffer, 0, array, array.Length - count, count);
}
When another user calls our new generic method, it'll look completely indistinguishable from the original:
int[] numbers = new[]{ 1, 2, 3, 4 };
RotateLeft(numbers, 1);
But it now works with any array type as its argument:
string[] words = new[]{ "1", "2", "3", "4" };
RotateLeft(words, 1);
The user can explicitly pass a type parameter, like so: RotateLeft<int>(numbers, 1)
, but in our case this is redundant since the compiler can simply infer the type parameter from the numbers
or words
argument.
This is what makes generics so useful - I can define data structures or methods that operate on arbitrary types of input data that I don't have to know about, and the user of my code will still get the benefit of type safety since the input type is defered until they decide what to reuse my code for - that's freaking awesome!
I hope this showcased, at least at a very basic level, the nature and usefulness of generics in C# and .NET