The pitfalls of C#

If you've ever heard me talk about programming in any regard, you'll likely be aware that I simp for C#. I have never met another language that lets me develop as quickly as it does. It is hands down my favourite language and I will challenge anyone who tries to change my mind.

But that doesn't mean it's perfect. C#, much like every other language, has its drawbacks. I'm going to talk about some of the features that I've stumbled across in other languages that accomplish some goal better than C#.

Generics

Okay, so I'm starting with a controversial one. I gotta get this one out of the way first.

Type erasure

So I often give Java a lot of shit because it's subject to something called type erasure. Sure - Java as a language supports generics - but the JVM does not. As such, generics are erased at compile time. Consider this example in C#:

int Sum(List<int> numbers)
{
    var total = 0;
    foreach (int item in numbers)
    {
        total += item;
    }
    return total;
}

float Sum(List<float> numbers)
{
    float total = 0; // must be float: "var total = 0" would infer int and fail to compile
    foreach (float item in numbers)
    {
        total += item;
    }
    return total;
}

We are able to define two overloads of Sum, one accepting List<int> and one accepting List<float>. Since generics are supported by the runtime, these can exist as two entirely disparate overloads and passing in a list of either type will correctly resolve.

In Java, such a thing is not possible due to type erasure. We could try to create the functions:

int sum(ArrayList<Integer> numbers) {
    int total = 0;
    for (int item : numbers) {
        total += item;
    }
    return total;
}

float sum(ArrayList<Float> numbers) {
    float total = 0;
    for (float item : numbers) {
        total += item;
    }
    return total;
}

But we'll immediately hit a compile error. Since the JVM does not support generics, the type argument of ArrayList gets erased, which leaves two methods with the same erased parameter list: sum(ArrayList).
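To see the runtime side of this from C#, here is a minimal sketch. The class and method names are made up for illustration; the point is only that List<int> and List<float> stay distinct types at runtime and that T can still be inspected:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration, not part of any library.
static class ReifiedGenerics
{
    // typeof(T) works at runtime because the CLR keeps the type argument.
    // The list parameter is only used so the compiler can infer T.
    public static string ElementTypeName<T>(List<T> list) => typeof(T).Name;

    // Two constructed generic types are equal only if their type arguments match.
    public static bool SameRuntimeType(object a, object b) => a.GetType() == b.GetType();
}
```

ReifiedGenerics.ElementTypeName(new List<int>()) returns "Int32", and ReifiedGenerics.SameRuntimeType(new List<int>(), new List<float>()) returns false. The equivalent getClass() comparison on two Java ArrayLists would report the same class for both.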

C# generics being baked into the runtime solve this problem - but they introduce another:

Strongly typed limitations

Consider a simple Max function which returns the maximum of two values:

T Max<T>(T a, T b)
{
    return a > b ? a : b;
}

This looks innocuous. If a is greater than b, then a is returned - otherwise b is returned. The problem is this won't compile. Because generics are baked into the runtime, the method must be valid for any and all T, and T could be chosen at any point by calling code in an entirely different assembly. For this method to be valid, the compiler must know that the > operator is defined for T - which is not guaranteed to be the case here.

If I were to create a very simple class which has a single string property, and pass in two instances of it to Max, what on earth should it do? The question is unanswerable because such a thing is invalid. There are two ways to accomplish this in C#: constraining T to IComparable<T> and calling CompareTo, or using the generic math interfaces that were previewed in .NET 6 and shipped in .NET 7. That's right. Microsoft had to invent an entire workaround for this because they realised this is actually a fairly common thing to need to do, and doing it with operators was quite simply impossible until then.
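Both workarounds can be sketched side by side. The class name MaxExamples and the method name MaxWithOperator are invented for this example; the second method assumes .NET 7 or later, where System.Numerics.IComparisonOperators is available:

```csharp
using System;
using System.Numerics;

// Hypothetical container class for the two variants.
static class MaxExamples
{
    // Option 1: constrain T to IComparable<T> and call CompareTo instead of >.
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) > 0 ? a : b;
    }

    // Option 2 (assumes .NET 7+ / C# 11): the constraint promises that
    // the comparison operators exist for T, so > is allowed in the body.
    public static T MaxWithOperator<T>(T a, T b)
        where T : IComparisonOperators<T, T, bool>
    {
        return a > b ? a : b;
    }
}
```

MaxExamples.Max(2, 5) returns 5 and MaxExamples.MaxWithOperator(2.5, 1.5) returns 2.5. Either way the constraint has to be spelled out up front, so a type that defines > but implements neither interface is still rejected.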

Templates do it better

C++, on the other hand, has a very powerful template system. Templated functions are special - until they're used, they simply don't exist. The name is quite literal, in all honesty. It defines a template - it specifies how the function should be defined when and only when it's necessary to do so. Consider this max function in C++:

template<typename T>
T max(T a, T b) {
    return a > b ? a : b;
}

As it stands on its own, this function won't exist. There is no max anywhere until it's used. As soon as we introduce code which calls it, T is resolved from the argument types and a real function is generated which essentially replaces T with that type.

In other words: if we were to call max(2, 5), the compiler would see the arguments of type int and actually generate a function whose signature is int max(int a, int b). This will then compile fine, as > is defined for int, and max behaves as intended. Call it on 2 floats? Now there's a float max(float a, float b) function. Call it for a custom struct which defines the > operator? Cool, now there is a MyStruct max(MyStruct a, MyStruct b) function too. As long as the type you specify defines the > operator, this function works perfectly fine.

And that is reason 1 why C# is lacking.

Onto reason 2!

Enums

Okay so Java has a unique ability to drive me actually insane with its lack of stack allocated types, operator overloading, extension methods, and more. But it does have one thing incredibly useful: enums behave like classes.

Let's suppose we have an enum which contains the days of the week. Let's first define it in C#:

enum DayOfWeek
{
    Monday,
    Tuesday,
    Wednesday,
    Thursday,
    Friday,
    Saturday,
    Sunday
}

Nice and simple. Now let's do the same in Java:

enum DayOfWeek {
    MONDAY,
    TUESDAY,
    WEDNESDAY,
    THURSDAY,
    FRIDAY,
    SATURDAY,
    SUNDAY
}

Nothing out of the ordinary so far. But now suppose we wish to store related information about each entry. Perhaps we want to be able to easily determine if a given DayOfWeek is on the weekend. In C#, we could achieve this with an extension method:

static class DayOfWeekExtensions
{
    public static bool IsWeekend(this DayOfWeek dayOfWeek)
    {
        return dayOfWeek is DayOfWeek.Saturday or DayOfWeek.Sunday;
    }
}

This lets us call DayOfWeek.Monday.IsWeekend() which will return false, and DayOfWeek.Saturday.IsWeekend() which will return true.

The problem is ultimately this: the extension method doesn't have to be implemented in the same assembly - there's no inherent reliability in the result actually being accurate. A third party could have implemented this extension method with malicious intent, which is allowed because extension methods provide a contract which the original author did not intend.

Aside from that, we also had to define an entirely new static class just to create an extension method to return a simple boolean about the state of the enum value. How wasteful.

Java does this better. In Java, you can define this very state as part of the contract itself. We can define the DayOfWeek enum like so:

enum DayOfWeek {
    MONDAY(false),
    TUESDAY(false),
    WEDNESDAY(false),
    THURSDAY(false),
    FRIDAY(false),
    SATURDAY(true),
    SUNDAY(true);

    private final boolean _isWeekend;

    private DayOfWeek(boolean isWeekend) {
        this._isWeekend = isWeekend;
    }

    public boolean isWeekend() {
        return this._isWeekend;
    }
}

You'll notice some slightly strange syntax - the entries in the enum are invoking a constructor - which caches the argument value as a field. The isWeekend method is baked into the enum itself, and returns the value of this field. We did not have to define any sort of helper class nor do we have to just implicitly “trust” that the result is accurate. Since the condition of being on the weekend is provided by the enum, we can reasonably assume this contract is reliable. It also just simply makes sense in general. “Weekend-ness” is inherent to the day of the week and such a property will never change, so why not? Why shouldn't you be able to do something like this?
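For comparison, the closest C# approximation is usually a hand-written class with a fixed set of instances, sometimes called a "smart enum". This is only a sketch of that pattern, and the name WeekDay is made up (chosen to avoid clashing with System.DayOfWeek):

```csharp
// Hypothetical "smart enum": a sealed class with a closed set of instances.
public sealed class WeekDay
{
    public static readonly WeekDay Monday = new WeekDay("Monday", isWeekend: false);
    public static readonly WeekDay Tuesday = new WeekDay("Tuesday", isWeekend: false);
    public static readonly WeekDay Wednesday = new WeekDay("Wednesday", isWeekend: false);
    public static readonly WeekDay Thursday = new WeekDay("Thursday", isWeekend: false);
    public static readonly WeekDay Friday = new WeekDay("Friday", isWeekend: false);
    public static readonly WeekDay Saturday = new WeekDay("Saturday", isWeekend: true);
    public static readonly WeekDay Sunday = new WeekDay("Sunday", isWeekend: true);

    public string Name { get; }
    public bool IsWeekend { get; }

    // Private constructor: no other code can create additional days.
    private WeekDay(string name, bool isWeekend)
    {
        Name = name;
        IsWeekend = isWeekend;
    }

    public override string ToString() => Name;
}
```

WeekDay.Saturday.IsWeekend is true and WeekDay.Monday.IsWeekend is false, with the data owned by the type itself. The cost is that this is no longer an enum: the values can't be used as case labels in a switch, Enum.GetValues knows nothing about them, and every member has to be written out by hand.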

I wish C# enums had this.

Anonymous interface implementations

Okay, last Java one I swear.

Java allows for anonymous derivations of interfaces inline. There is simply no analogue to this in C#. Suppose we have an interface like so:

interface IMyInterface
{
    void Foo();
}

And a method which accepts this interface:

void CallFoo(IMyInterface value)
{
    value.Foo();
}

In order to actually call this method, I need an instance of a type which implements this interface. In C#, this means defining a whole new class or struct which implements IMyInterface and the Foo method. However, in Java, this is not necessary. You can create anonymous implementations like this:

interface MyInterface {
    void foo();
}

void callFoo(MyInterface value) {
    value.foo();
}

callFoo(new MyInterface() {
    @Override
    public void foo() {
        System.out.println("Hello World");
    }
});

You don't need to create an entire class for it, if this is the only time such an implementation is required. I'll be honest, the first time I saw this syntax it threw me off a lot - but that was before I understood what was really happening here. I must say, I'm jealous that Java has this feature.
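The usual C# workaround is a small adapter class that forwards the interface method to a delegate. It is still a named class, just one that only has to be written once per interface. A minimal sketch, where DelegateMyInterface is a made-up name:

```csharp
using System;

interface IMyInterface
{
    void Foo();
}

// Hypothetical adapter: implements IMyInterface by forwarding to a delegate.
sealed class DelegateMyInterface : IMyInterface
{
    private readonly Action _foo;

    public DelegateMyInterface(Action foo)
    {
        _foo = foo ?? throw new ArgumentNullException(nameof(foo));
    }

    public void Foo() => _foo();
}
```

With this in place, the call site becomes CallFoo(new DelegateMyInterface(() => Console.WriteLine("Hello World"))). It gets close to the Java version for single-method interfaces, but every extra interface member means another delegate and more constructor plumbing.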

Strictly typed integer ranges

Let's go for a really obscure one. Ada.

As you'd expect, Ada supports integers. Duh. But one thing about it that C# does not have is the ability to define ranged integers. This means, for example, we could define a Month type which only accepts values 1 through 12.

type Month is range 1 .. 12;

procedure Foo is
    ThisMonth : Month := 11;
begin
    null; -- code
end Foo;

Values that don't fall within this range present an error, and there is no simple way to accomplish this in C#. We are given primitives: byte, short, int, long - and most of the time we use int to represent most things, which means we ultimately have no choice but to allow values up to 2 billion, only to throw an exception when it's too late (during runtime).

It would be nice to have compile-time validation of an integer's value falling within a specific range. Ada, you have a thumbs up from me.
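For contrast, this is roughly what the C# equivalent looks like today: a wrapper struct that validates in its constructor. It is a sketch under the assumption that a runtime check is acceptable, and the Month type here is hypothetical:

```csharp
using System;

// Hypothetical wrapper: the range is enforced at runtime, not at compile time.
public readonly struct Month
{
    public int Value { get; }

    public Month(int value)
    {
        if (value < 1 || value > 12)
        {
            throw new ArgumentOutOfRangeException(
                nameof(value), value, "Month must be between 1 and 12.");
        }

        Value = value;
    }
}
```

new Month(11) works and new Month(13) throws, but only once the code is actually running. There is also a hole: default(Month) skips the constructor entirely and yields a Value of 0, which is exactly the kind of thing a language-level range type would rule out.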

Typedef

C/C++ allows you to essentially redefine a type with a new name using the typedef keyword. This means we can have platform-dependent definitions. We could define something like:

#if defined(_WIN32) || defined(_WIN64)
#   ifdef _WIN64
typedef long long ptraddr;
#   else
typedef long int ptraddr;
#   endif
#endif

// yes, I know this will only work on MSVC. GCC has its own defines. pipe down!

Of course, a real use of typedef would be more nuanced than this. But this lets us have ptraddr be an alias to either long long or long int depending on the platform.

C# kind of has an analogue to this, in the form of a using alias:

#if WIN64
using PtrAddr = System.Int64;
#elif WIN32
using PtrAddr = System.Int32;
#endif

But the problem with this is that we have to define WIN32 and WIN64 ourselves in the csproj, or as compiler flags. Prior to C# 10, the alias also only exists in a single compilation unit, which means if we wanted to use PtrAddr in other files, we'd have to repeat this code. (In C# 10 and later, this can be achieved via a global using directive.) The nature of C/C++ header files makes this repetition unnecessary: we only need to write the typedef once. A using alias just doesn't cut it.
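For reference, the C# 10 variant would look something like the fragment below. The file name is made up, and the WIN32 and WIN64 symbols are still assumed to be defined by hand in the project file or on the command line:

```csharp
// GlobalAliases.cs (hypothetical file name); requires C# 10 or later.
// A global using alias applies to every file in the same project,
// but only within that project: referencing assemblies do not see it.
#if WIN64
global using PtrAddr = System.Int64;
#elif WIN32
global using PtrAddr = System.Int32;
#endif
```

This removes the per-file repetition, but the alias is still a compile-time convenience local to one project rather than a real type that can be exported the way a typedef in a shared header can.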

Static locals

The last feature I'd like to fanboy over comes from VB and PHP.

In VB, we're able to define a local variable whose value will persist between invocations while still keeping the variable in tight scope:

Sub CountUp()
    Static counter As Integer = 0
    counter += 1 ' VB has no ++ though, so C# wins here
    Console.WriteLine(counter)
End Sub

In this situation, counter is only accessible from within the CountUp method, yet the value of counter increases with each call. Namely, calling the function like so…

CountUp() ' 1
CountUp() ' 2
CountUp() ' 3
CountUp() ' 4
CountUp() ' 5

... would print the values 1 through 5 in succession.

In PHP, an equivalent function looks like this:

function countUp() {
    static $counter = 0;
    $counter++;
    printf("%d\n", $counter);
}

countUp(); // 1
countUp(); // 2
countUp(); // 3
countUp(); // 4
countUp(); // 5

The way this works in VB is the compiler actually generates a field for the variable, which is how its value is persistent, but this field is generated with a name that contains “illegal” characters so accessing it is only possible with reflection. In C#, the closest equivalent implementation would be to create your own field like so:

private int _counter = 0;

void PrintNumber()
{
    _counter++;
    Console.WriteLine(_counter);
}

However, the problem now becomes that _counter is within scope for the entire type, which means other methods in this type could modify this value. A static local solves this issue, and sadly there is no way to create one in C#.
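A closure can approximate the scoping, though it is not a true static local. In this sketch (CounterFactory is a made-up name), the counter variable is captured by the returned delegate and nothing else in the type can touch it:

```csharp
using System;

// Hypothetical factory: each call creates an independent, hidden counter.
static class CounterFactory
{
    public static Func<int> Create()
    {
        int counter = 0;          // captured by the lambda below
        return () => ++counter;   // increments and returns the captured value
    }
}
```

Calling var countUp = CounterFactory.Create(); and then invoking countUp() three times yields 1, 2 and 3. The difference from VB and PHP is that the state belongs to the delegate rather than to the method, so the caller has to hold on to countUp, and a second call to Create starts again from zero.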

Final notes

I'd like to stress that I'm aware C# is not without its flaws. The next time you hear me boast about how C# is superior or how another language is shit in comparison, remember that I understand there are pros and cons to both sides. I'm well aware that other languages do some things better, and we haven't even spoken about how poor the CLR garbage collection process is.

But despite all of this, I won't be switching languages any time soon. I accept my fate and resign myself to working with what I know best.

If there are any features from other languages (ones that aren't your primary language) that you are jealous of and wish your preferred language had, I'd be very curious to hear about them. Leave a comment and let me know.

What language features do you envy?
