2009/09/30 by Andrew Smallshaw
This is in response to The C programming language and its importance
I've seen many gushing uncritical treatises praising C to the hilt over the years. They usually strike me as naive and this one is no exception. This one at least avoids portraying C as all things to all people, but ultimately C is in many ways a backwards and primitive language.
In some ways C was ahead of its time: removing I/O operations from the language proper and relegating them to library functions for instance. It avoids C feeling too odd when you are writing a GUI or embedded app. If you compare it to Fortran for instance, where these are language primitives, it is much more elegant - there are no actual keywords to be avoided in situations where they are not relevant.
However C still has many obvious flaws. The curious operator precedence comes to mind straight away - there is only one sane way of parsing something like (a == b & c), but it is not the one C chooses: because == binds more tightly than &, C reads it as ((a == b) & c). Syntactic limitations, such as the lack of a "then" keyword, were introduced for reasons that now seem laughable, but they still remain, so that many simple typographical errors result in syntactically valid but incorrect code - if you have ever spent an hour wondering why "if (a == 1);" apparently behaves as a tautology, this is what caught you out. It also lacks many facilities you would expect of a modern programming language.
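Both traps fit in a few lines. The following is only a sketch: the function names are made up for illustration, and a compiler with warnings enabled will usually flag both constructs.

```c
/* How C actually parses a == b & c: equality binds tighter than bitwise AND. */
int parsed_by_c(int a, int b, int c)
{
    return a == b & c;          /* same as (a == b) & c */
}

/* What the programmer almost certainly meant. */
int intended(int a, int b, int c)
{
    return a == (b & c);
}

/* A stray semicolon is an empty statement that forms the whole body of
   the if, so the block that follows runs whatever the value of a. */
int stray_semicolon(int a)
{
    int ran = 0;
    if (a == 1);
    {
        ran = 1;                /* plain block, executed unconditionally */
    }
    return ran;
}
```

With a, b and c all equal to 4, intended() returns 1 but parsed_by_c() returns 0, since (4 == 4) is 1 and 1 & 4 is 0. stray_semicolon() returns 1 for any argument.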
For example, the lack of any form of exception handling is inexcusable in a modern language. As a result you are constantly manually checking return values for fringe conditions, and then (in theory at least) passing those errors back up the program structure until such a time that hopefully you are able to handle them in a sane manner. Often the logical structure of a program needs wholesale alteration simply to allow these error conditions to be propagated from one module to another. This is a burden the C programmer should not have to face.
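To make the burden concrete, here is a minimal sketch of three layers passing an error upwards by hand. The status codes and function names are invented for this example, and integer overflow is deliberately ignored to keep it short.

```c
/* Hypothetical status codes for this sketch. */
enum status { OK = 0, ERR_EMPTY = -1, ERR_NOT_A_DIGIT = -2 };

/* Lowest level: convert a string of decimal digits to an int.
   Overflow is not checked here. */
int parse_number(const char *text, int *out)
{
    int value = 0;
    if (text == 0 || *text == '\0')
        return ERR_EMPTY;
    for (; *text != '\0'; text++) {
        if (*text < '0' || *text > '9')
            return ERR_NOT_A_DIGIT;
        value = value * 10 + (*text - '0');
    }
    *out = value;
    return OK;
}

/* Middle level: cannot handle a failure itself, so it only relays it. */
int parse_size(const char *width, const char *height, int *area)
{
    int w, h, rc;
    if ((rc = parse_number(width, &w)) != OK)
        return rc;
    if ((rc = parse_number(height, &h)) != OK)
        return rc;
    *area = w * h;
    return OK;
}

/* Top level: the first place that can do something sensible. */
int area_or_default(const char *width, const char *height, int fallback)
{
    int area;
    if (parse_size(width, height, &area) != OK)
        return fallback;
    return area;
}
```

Note that the real result has been pushed out into a pointer parameter because the return value is taken up by the status code, and that parse_size() spends about as many lines relaying errors as it does on its actual job.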
However, the biggest problem has to be memory management, or more to the point, the almost complete lack thereof. You can do pretty much whatever you want in C but you are doing it all yourself. It shows itself up first and foremost in what are laughably termed strings - an operation that should be a simple function call turns into a lengthy sequence of operations, but before you even start you have to calculate how long the resulting string is going to be and allocate the space for it manually. Of course, that in turn means you have to work out how to communicate a failure to allocate that memory to the rest of the program, not to mention determining when and where to conveniently free the space when you are done with it.
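As an illustration, this is roughly what joining two strings looks like when done by hand. It is a sketch rather than a recommended library routine, and the name concat is simply chosen for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of a followed by b, or NULL if the
   allocation fails. The caller owns the result and must free() it. */
char *concat(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *out = malloc(la + lb + 1);    /* +1 for the terminating '\0' */
    if (out == NULL)
        return NULL;                    /* failure is now the caller's problem */
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);        /* also copies b's terminator */
    return out;
}
```

Every caller then has to check for NULL and remember to call free() on the result, which is exactly the bookkeeping described above.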
More complex data types can be even more problematic: a simple linked list soon becomes a no-brainer, and can be compacted down to a couple of sections of maybe half a dozen lines each, but the reason it becomes so straightforward is simply because you do it so frequently, for C has no built in list type. When implementing more complex data structures, managing memory and the relationships between elements becomes exponentially more difficult than solving the actual problem at hand. Using ML I can in an afternoon write a rich API implementing a sophisticated data structure, with multiple many-many relationships and shared elements between distinct data structures. In C I will still be chasing NULL and dangling pointers, twiddling reference counters and praying I don't introduce memory leaks a fortnight later. In practice this tends to mean that simplistic structures are chosen over more advanced ones: why do you think there are so many buffer overflow security vulnerabilities?
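For reference, the "couple of sections of maybe half a dozen lines each" look something like the sketch below. The struct layout and function names are one common arrangement, assumed here for illustration.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepend value to the list. Returns 0 on success, or -1 if the
   allocation fails, in which case the list is left unchanged. */
int list_push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Count the nodes in the list. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Free every node. The next pointer is saved before each free(),
   because reading it afterwards would use a dangling pointer. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Even in this trivial case the allocation failure, the ownership of the nodes and the order of operations in list_free() are all left to the programmer.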
Next we have C's supposed strengths. The obvious one is that it can do things that no other language can. This is demonstrably false. Leaving aside the earlier-mentioned problem of feasibility - something may be possible but so fiendishly complex it is not economic - it is not true in any case. When it comes to direct hardware manipulation the supposed power of C (pointers) is hardly unique nor even the only way to solve the issue. I am of the generation that grew up with the likes of the Sinclair Spectrum and Commodore 64 - we got quite used to PEEKing and POKEing our way through system memory even in interpreted Basic. In any case how do you handle interrupts or program DMA transfers in C? You use a library function to do it - there are no native capabilities. This is hardly an approach inapplicable to other languages.
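To show how little separates the two approaches, PEEK and POKE can be written in C as a pair of one-line functions. This is a sketch only: the names mirror the Basic keywords, and the conversion from an integer to a pointer is implementation-defined rather than something the language guarantees.

```c
#include <stdint.h>

/* Read one byte from an address, in the manner of Basic's PEEK.
   volatile stops the compiler from optimising the access away. */
uint8_t peek(uintptr_t address)
{
    return *(volatile uint8_t *)address;
}

/* Write one byte to an address, in the manner of Basic's POKE. */
void poke(uintptr_t address, uint8_t value)
{
    *(volatile uint8_t *)address = value;
}
```

On bare hardware the address would be that of a device register or screen memory. Under a hosted operating system only addresses belonging to the program itself are safe to use, for example the address of one of its own variables cast to uintptr_t.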
C does benefit from the fact that its runtime support library is very compact. It is true that this makes it appropriate for embedded and other low level systems. However, this is not a property unique to C. Forth and even some Basic dialects have similarly small footprints. Or of course you can go for assembler with no overhead at all. When memory gets really constrained even C's malloc() gets troublesome, since it inevitably leads to memory fragmentation, and for this reason embedded systems often forgo its use entirely. It may surprise you to learn that up until 10-15 years ago you didn't even have a malloc() available when programming inside the Unix kernel, and I for one would argue that introducing it was a retrograde step.
A real example of hardware access: consider Sun's OpenPROM system. This is the equivalent of the BIOS you will find on Sun hardware, but it is a bit more advanced in that it also incorporates other capabilities such as programmability and proper diagnostic routines. It was standardised and later picked up by Apple for its machines, which we will come to in a minute. The bulk of the system is actually written in Forth. Each add-on card also includes its own add-on drivers and diagnostics that integrate themselves into the system. The drivers remain available and in use even after the operating system has booted if the OS lacks full native driver support for a piece of hardware. The neat thing is that this works regardless of the CPU: it is executed by a tiny Forth interpreter and so it works on both 32 and 64 bit SPARC, and on Apple's PowerPC and x86 architectures. Supposedly this is impossible in anything other than C, but you would be hard pressed to use it to do anything of the sort.
Of course, all this is not to say C is completely without merit: on the contrary, it is still one of my preferred languages. It is often described as a small language, but I dislike that characterisation: it is more mid-sized. Tcl or Scheme are true small languages, but that smallness does not benefit you at all since all that happens is that the complexity moves to the library. At the other end of the scale you have the truly large languages such as C++ or Ada where the entire language is almost too big to keep in your head at once. C is in the Goldilocks zone somewhere in the middle: the language is large and expressive enough to be useful without ever being daunting.
However, the real strength of C is nothing to do with the virtues of the language per se at all. It is its very ubiquity. Almost any computer platform you care to mention is going to have at least one C implementation for it. In addition, any third party libraries you may want to use are going to be either written in C or at least have C linkage, so if you want to simultaneously pull in libraries for database access, a network protocol, hardware control and still have full system call capabilities that is not a problem in C. It is this position as the lingua franca of computer programming that is its single greatest strength.
© 2009-11-07 Andrew Smallshaw