Languages don't have speeds

Oliver Oliver

I often hear “C++ is faster than C#”, or “Python / Java is slow”. This irks me, and here's why.

The only “language” that your CPU understands is its native instruction set. If you're on a desktop, that's almost certainly going to be some variation of x86 (more specifically it's going to be the 64-bit extension x86-64). If you're on mobile, it's probably going to be some version of ARM.

Your Python code doesn't get executed, nor does your C# code, nor does your Rust code. Saying that any given language is faster than any other given language is a nonsense statement, because the language itself isn't really being “executed” in any way.

What I believe is actually being asserted here is the more plausible claim that “compiled languages are faster than interpreted languages”. But even this is misleading, because once a language is compiled it has been converted into the CPU's instruction set (or into an intermediate format such as CIL or Java bytecode, which a JIT compiler then converts to native machine code), and it's that machine code, not the language, which gets executed.

The same is ultimately true of interpreted languages, just in a more abstracted manner. Though an interpreter doesn't generate native instructions as such, the interpreter itself is already native¹, so every construct in the target language ultimately “maps” onto native instructions inside the interpreter, sort of. Interpreters may be conceptually simple, but they're tricky to explain precisely.
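That “mapping” is easier to see in code than in prose. Here's a toy interpreter, sketched in Python, for a made-up micro-language of `+`/`-` commands (the language and the `run` function are invented for illustration). The program being interpreted never becomes machine code; each of its commands is dispatched to code in the host interpreter, and that host code is what's already native.

```python
# A toy interpreter for a hypothetical micro-language of "+" and "-"
# commands. The interpreted program never gets compiled; every command
# just selects some already-native behaviour in the host.

def run(program: str, start: int = 0) -> int:
    """Interpret a string of '+' and '-' commands against a counter."""
    value = start
    for op in program:
        if op == "+":
            value += 1   # the "+" syntax maps onto the host's native add
        elif op == "-":
            value -= 1   # likewise for subtract
        else:
            raise ValueError(f"unknown command: {op!r}")
    return value

print(run("+++-"))  # → 2
```

The CPU only ever executes the instructions behind `run` itself; the micro-language has no “speed” of its own, which is the whole point of this post in miniature.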

But ultimately this doesn't matter. A language being compiled or interpreted is no real indication as to its “speed”. The speed comes from the instructions which are actually executed on the CPU, and this is nothing to do with the language and everything to do with how that language gets compiled or interpreted.

The exact same logic, even something as simple as printing Hello World to stdout, can be executed on the CPU in more than one way. Hello World in C# behaves differently to Hello World in C++, and that isn't because one language is inherently different to the other; it's down to how each compiler decides to convert your code into machine code.
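You can watch a compiler make exactly this kind of decision without leaving CPython. A small sketch (opcode details vary by CPython version, but the constant folding has been there for a long time): the expression `2 * 3` is precomputed at compile time, so no multiplication is ever executed at run time, while `x * 3` must emit a real multiply instruction. Same logic, different instructions, purely the compiler's call.

```python
# CPython folds constant expressions at compile time. Compare the
# bytecode for a foldable expression against one that must run.
import dis

folded = compile("y = 2 * 3", "<example>", "exec")
runtime = compile("y = x * 3", "<example>", "exec")

print(6 in folded.co_consts)   # True: the product was precomputed
dis.dis(folded)    # just loads the constant 6
dis.dis(runtime)   # performs the multiplication at run time
```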

You see, any language can be compiled to machine code. There is a difference between the language rules and syntax that you write, and the compiler/runtime environment which takes that language as input. C++ developers are all too aware of how the exact same source code can yield drastically different outputs depending purely on the compiler parsing it and the optimisation levels you specify. The same code fed into MSVC, for example, can produce a very different output from GCC or Clang. Though all of these compilers output machine code, they do so differently, utilising different techniques and different optimisations.
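The same idea can be shown in miniature inside CPython itself: identical source, different compiled output, purely because of the optimisation level handed to the compiler. `compile()`'s `optimize=2` (the behaviour behind the `-OO` flag) strips docstrings, so the two compiled versions of `f` below genuinely differ even though the language and the source never changed.

```python
# One source string, two different code objects, depending only on
# the optimisation level given to CPython's compiler.
src = 'def f():\n    """docstring"""\n    return 1\n'

normal = compile(src, "<example>", "exec", optimize=0)
stripped = compile(src, "<example>", "exec", optimize=2)

ns0, ns2 = {}, {}
exec(normal, ns0)
exec(stripped, ns2)

print(ns0["f"].__doc__)  # 'docstring'
print(ns2["f"].__doc__)  # None: removed by the optimiser
```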

You would think that the way to write the most efficient code would be to use one of the languages in the Assembly family for your specific CPU architecture, but often even that isn't the case: compilers are actual fucking wizardry and witchcraft, taking advantage of countless optimisation techniques that humans would never dare think about. It's often a far better solution to use a higher-level language anyway. Does this mean that <insert any language here> is faster than Assembly? Is C++ faster than Assembly? No. Again, C++ isn't fast. The machine code generated by the compiler is fast. Unless you know the hardware you're developing for in extensive detail, a C++ compiler is pretty much always going to beat your attempts at producing efficient results. However, that's not a benefit inherent to using C++; it's a benefit of having access to a good compiler for it.

It's entirely possible to write a compiler for Python, for example, which outputs the exact same machine code, and utilises the same optimisations, as any other compiled language. In fact, standard Python already functions as both an interpreted and an intermediately-compiled language: CPython translates your source into bytecode before interpreting it. This is made even more evident by the existence of IronPython, a version of Python which runs on the CLR. It lets you take advantage of all the benefits and optimisations inherent to .NET, compiling to the same CIL as C#, VB, and all the other .NET languages.
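You can see CPython's intermediate compilation step on disk: every imported module is compiled to bytecode and cached in `__pycache__` before a single line is interpreted. A minimal sketch using only the standard library (file names here are invented for the example):

```python
# CPython compiles modules to bytecode and caches them as .pyc files.
# We trigger that compilation explicitly and inspect the result.
import importlib.util
import pathlib
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = pathlib.Path(tmp) / "hello.py"
    source.write_text('print("Hello World")\n')

    # Explicitly byte-compile the module, as the import system would.
    pyc = py_compile.compile(str(source), cfile=str(source) + "c")
    data = pathlib.Path(pyc).read_bytes()

    # The cached file opens with CPython's bytecode magic number.
    print(data[:4] == importlib.util.MAGIC_NUMBER)  # True
```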

*(Image: how I envision you picturing me right now.)*

It might seem like a bit of a technicality. Maybe it is, considering the language you're using generally dictates which compiler you'll be using, which ultimately determines how your source gets compiled or interpreted. It's not like you can force JavaScript to compile straight to x86 instructions and forego the interpreting engine. Except… you actually can. In fact, you can do this for any language. You can write your own compiler that takes in any source language you want and outputs efficient machine code. If we all started doing this, every language could “be as fast” as every other language, because they'd all compile the same way.

Languages don't have speeds. Some compilers just produce better results than others.

Update 4 May 2024

I forgot to mention in this post the very fact that C# can compile in two different ways. The traditional way is to compile to CIL, but more recently the .NET team have introduced NativeAOT, which compiles C# straight to a native binary.

You can't say “C# became faster”, because the language didn't change in any way nor did it have to. The compiler changed.

  1. Generally. It's possible to write an interpreter which is itself written in an interpreted language, but that would just be insane, right?
