C is for Crap

2009/09/30 by Andrew Smallshaw

This is in response to The C programming language and its importance

I've seen many gushing uncritical treatises praising C to the hilt over the years. They usually strike me as naive and this one is no exception. This one at least avoids portraying C as all things to all people, but ultimately C is in many ways a backwards and primitive language.

In some ways C was ahead of its time: removing I/O operations from the language proper and relegating them to library functions for instance. It avoids C feeling too odd when you are writing a GUI or embedded app. If you compare it to Fortran for instance, where these are language primitives, it is much more elegant - there are no actual keywords to be avoided in situations where they are not relevant.

However C still has many obvious flaws. The curious operator precedence comes to mind straight away - there is only one sane way of parsing something like (a == b & c), but it is not the one C chooses. Syntactic limitations, such as the lack of a "then" keyword, were introduced for reasons that now seem laughable, yet they remain, and they make many simple typographical errors result in syntactically valid but incorrect code - if you have ever spent an hour wondering why "if (a == 1);" apparently behaves as a tautology, this is what caught you out. It also lacks many facilities you would expect of a modern programming language.
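Both pitfalls can be shown in a few lines. This is only an illustrative sketch: the function names are invented for the example, and most compilers will warn about both constructs if you ask them to.

```c
/* How C reads `a == b & c`: == binds tighter than &,
 * so this is (a == b) & c. */
int as_c_parses(int a, int b, int c)
{
    return a == b & c;
}

/* What the programmer almost certainly meant:
 * compare a against the masked value. */
int as_usually_intended(int a, int b, int c)
{
    return a == (b & c);
}

/* The stray semicolon is an empty statement that ends the if.
 * The braces that follow are just an ordinary block, so it
 * runs whatever the value of a. */
int stray_semicolon(int a)
{
    int taken = 0;

    if (a == 1);
    {
        taken = 1;
    }
    return taken;
}
```

With a = 2, b = 6 and c = 3 the first function yields 0 and the second yields 1, while stray_semicolon() returns 1 for every argument.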

For example, the lack of any form of exception handling is inexcusable in a modern language. As a result you are constantly manually checking return values for fringe conditions, and then (in theory at least) passing those errors back up the program structure until such a time that hopefully you are able to handle them in a sane manner. Often the logical structure of a program needs wholesale alteration simply to allow these error conditions to be propagated from one module to another. This is a burden the C programmer should not have to face.
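A minimal sketch of what that manual propagation looks like, assuming a made-up three-layer program (the status codes and function names are hypothetical, chosen only for this example). Every layer has to test the result of the one below and hand the failure upwards itself:

```c
/* Hypothetical status codes for this sketch. */
enum status { ST_OK = 0, ST_SYNTAX = -1, ST_RANGE = -2 };

/* Lowest layer: convert one character to a digit. */
static enum status parse_digit(char ch, int *out)
{
    if (ch < '0' || ch > '9')
        return ST_SYNTAX;
    *out = ch - '0';
    return ST_OK;
}

/* Middle layer: parse a percentage in the range 0..100. */
static enum status parse_percent(const char *text, int *out)
{
    int value = 0;

    if (*text == '\0')
        return ST_SYNTAX;
    for (; *text != '\0'; text++) {
        int digit;
        enum status st = parse_digit(*text, &digit);

        if (st != ST_OK)
            return st;              /* pass the error up unchanged */
        value = value * 10 + digit;
        if (value > 100)
            return ST_RANGE;
    }
    *out = value;
    return ST_OK;
}

/* Top layer: the real result travels through an out parameter
 * because the return value is occupied by the status. */
enum status scale(const char *text, int total, int *out)
{
    int percent;
    enum status st = parse_percent(text, &percent);

    if (st != ST_OK)
        return st;                  /* ...and up again */
    *out = total * percent / 100;
    return ST_OK;
}
```

Note how the function signatures are shaped by the error path rather than by the calculation: the result has to be returned through a pointer at every level.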

However, the biggest problem has to be memory management, or more to the point, the almost complete lack thereof. You can do pretty much whatever you want in C but you are doing it all yourself. It shows itself up first and foremost in what are laughably termed strings - an operation that should be a simple function call turns into a lengthy sequence of operations, but before you even start you have to calculate how long the resulting string is going to be and allocate the space for it manually. Of course, that in turn means you have to work out how to communicate a failure to allocate that memory to the rest of the program, not to mention determining when and where to conveniently free the space when you are done with it.
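Joining two strings is the classic case. The helper below is a sketch rather than a recommended library routine (concat is a name invented here); it shows the length calculation, the manual allocation, the failure path and the ownership question all turning up for one trivial operation:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding a followed by b,
 * or NULL if the allocation fails. The caller must free() it. */
char *concat(const char *a, const char *b)
{
    size_t len_a, len_b;
    char *result;

    assert(a != NULL && b != NULL);

    len_a = strlen(a);
    len_b = strlen(b);

    result = malloc(len_a + len_b + 1);      /* +1 for the terminator */
    if (result == NULL)
        return NULL;                         /* caller must check this */

    memcpy(result, a, len_a);
    memcpy(result + len_a, b, len_b + 1);    /* copies b's terminator too */
    return result;
}
```

Every caller then has to check for NULL and remember to call free() on the result once it is finished with it.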

More complex data types can be even more problematic: a simple linked list soon becomes a no-brainer, and can be compacted down to a couple of sections of maybe half a dozen lines each, but the reason it becomes so straightforward is simply because you do it so frequently, for C has no built in list type. When implementing more complex data structures, managing memory and the relationships between elements becomes exponentially more difficult than solving the actual problem at hand. Using ML I can in an afternoon write a rich API implementing a sophisticated data structure, with multiple many-many relationships and shared elements between distinct data structures. In C I will still be chasing NULL and dangling pointers, twiddling reference counters and praying I don't introduce memory leaks a fortnight later. In practice this tends to mean that simplistic structures are chosen over more advanced ones: why do you think there are so many buffer overflow security vulnerabilities?
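For reference, this is roughly the sort of hand-rolled list meant here: a sketch with invented names (struct node, list_push and friends), not code from any particular project. It is short, but allocation, failure reporting and cleanup are all the programmer's job:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Push a value onto the head of the list.
 * Returns 0 on success, -1 if memory could not be allocated. */
int list_push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Count the elements by walking the chain. */
size_t list_length(const struct node *head)
{
    size_t count = 0;

    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Release every node and leave the caller's pointer NULL so
 * that it cannot dangle. */
void list_free(struct node **head)
{
    while (*head != NULL) {
        struct node *next = (*head)->next;

        free(*head);
        *head = next;
    }
}
```

Forget the call to list_free() and you have a leak; keep a second pointer into the list after calling it and you have a dangling pointer.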

Next we have C's supposed strengths. The obvious one is that it can do things that no other language can. This is demonstrably false. Leaving aside the earlier-mentioned problem of feasibility - something may be possible but so fiendishly complex it is not economic - it is not true in any case. When it comes to direct hardware manipulation the supposed power of C (pointers) is hardly unique, nor even the only way to solve the issue. I am of the generation that grew up with the likes of the Sinclair Spectrum and Commodore 64 - we got quite used to PEEKing and POKEing our way through system memory even in interpreted Basic. In any case how do you handle interrupts or program DMA transfers in C? You use a library function to do it - there are no native capabilities. This is hardly an approach inapplicable to other languages.
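To show how little is involved, here is a sketch of PEEK and POKE written as C functions (the names simply mirror the Basic keywords and are not part of any standard library). All the pointer buys you is the ability to read or write a byte at a numeric address:

```c
#include <stdint.h>

/* Read one byte from an arbitrary address, like Basic's PEEK. */
uint8_t peek(uintptr_t address)
{
    return *(volatile uint8_t *)address;
}

/* Write one byte to an arbitrary address, like Basic's POKE. */
void poke(uintptr_t address, uint8_t value)
{
    *(volatile uint8_t *)address = value;
}
```

On a hosted system you can only safely point these at memory your program owns, for example the address of one of its own variables; on bare hardware the same two functions would reach device registers.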

C does benefit from the fact that its runtime support library is very compact. It is true that this makes it appropriate for embedded and other low level systems. However, this is not a property unique to C. Forth and even some Basic dialects have similarly small footprints. Or of course you can go for assembler with no overhead at all. When memory gets really constrained even C's malloc() gets troublesome, since it inevitably leads to memory fragmentation and for this reason embedded systems often forgo its use entirely. It may surprise you to learn that up until 10-15 years ago you didn't even have a malloc() available when Unix kernel programming, and I for one would argue that introducing it was a retrograde step.
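One common way of doing without malloc() is a statically sized pool of identical blocks. The sketch below is an assumption about the general technique, not a description of any particular kernel or product, and the names (struct msg, pool_get, pool_put) and sizes are invented for illustration. Because every block is the same size, the pool cannot fragment:

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCKS 4               /* hypothetical pool size */

struct msg {
    struct msg *next_free;          /* links blocks on the free list */
    char payload[32];               /* hypothetical fixed payload */
};

static struct msg pool[POOL_BLOCKS];
static struct msg *free_list;

/* Thread every block onto the free list. Call once at start-up. */
void pool_init(void)
{
    size_t i;

    free_list = NULL;
    for (i = 0; i < POOL_BLOCKS; i++) {
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
}

/* Take a block from the pool, or return NULL if it is exhausted. */
struct msg *pool_get(void)
{
    struct msg *m = free_list;

    if (m != NULL)
        free_list = m->next_free;
    return m;
}

/* Hand a block back to the pool. */
void pool_put(struct msg *m)
{
    assert(m != NULL);
    m->next_free = free_list;
    free_list = m;
}
```

The price is that the worst-case memory requirement has to be decided at compile time.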

A real example of hardware access: consider Sun's OpenPROM system. This is the equivalent of the BIOS you will find on Sun hardware, but it is a bit more advanced in that it also incorporates other capabilities such as programmability and proper diagnostic routines. It was standardised and later picked up by Apple for its machines, which we will come to in a minute. The bulk of the system is actually written in Forth. Each add-on card also includes its own add-on drivers and diagnostics that integrate themselves into the system. The drivers remain available and in use even after the operating system has booted if the OS lacks full native driver support for a piece of hardware. The neat thing is that this works regardless of the CPU: it is executed by a tiny Forth interpreter and so it works on both 32 and 64 bit SPARC, and on Apple's PowerPC and x86 architectures. Supposedly this is impossible in anything other than C, but you would be hard pressed to use it to do anything of the sort.

Of course, all this is not to say C is completely without merit: on the contrary, it is still one of my preferred languages. It is often described as a small language, but I dislike that characterisation: it is more mid-sized. Tcl or Scheme are true small languages: it does not benefit you at all since all that happens is that the complexity moves to the library. At the other end of the scale you have the truly large languages such as C++ or Ada where the entire language is almost too big to keep in your head at once. C is in the Goldilocks zone somewhere in the middle: the language is large and expressive enough to be useful without ever being daunting.

However, the real strength of C is nothing to do with the virtues of the language per se at all. That is its very ubiquity. Almost any computer platform you care to mention is going to have at least one C implementation for it. In addition, any third party libraries you may want to use are going to be either written in C or at least have C linkage, so if you want to simultaneously pull in libraries for database access, a network protocol, hardware control and still have full system call capabilities that is not a problem in C. It is this position as the lingua franca of computer programming that is its single greatest strength.






15 comments










Wed Sep 30 12:32:39 2009: 7013   TonyLawrence

Good points, Andrew.

Way back in 1990 I wrote this (link) on chasing an errant pointer. I haven't done any C in years - maybe that's why :-)



Wed Sep 30 15:18:08 2009: 7016   BigDumbDinosaur

In another article, Tony mentions the perils of discussing religion and politics. He must've forgotten that arguing over which programming language is "best" is equally dangerous. <Grin> I won't even bring up the topic of sex.

However, the biggest problem has to be memory management, or more to the point, the almost complete lack thereof.

In Girish Venkatachalam's article, I said that C is a high level abstraction of assembly language. It is this characteristic that produces many of Andrew's beefs with C, but also explains C's popularity in system level programming. As in C, memory management in assembly language is essentially conspicuous by its absence. C was concocted to be a less-unfriendly way of programming close to the machine level, unlike languages like COBOL, BASIC, etc., which attempt to shield the programmer from the underlying operating system and hardware, and don't require (or allow) the programmer to manage memory. Naturally, such shielding comes at a price. For example, while the automatic string management of BASIC is great, it is also relatively slow. Also, some interpreters are balky when a string contains "troublesome" characters, such as 0x00 (null) or 0xFF.

C does some things very well, such as math, which is something that is not easy to correctly implement in assembly language. Also, the automatic use of a stack to handle variables passed to functions is convenient and solves yet another problem that is onerous in M/L. C's ability to define space for a variable using a convenient syntax is certainly a step above having to manually do so in an assembly language program (although well-designed macros can take some of the work out of it). Most importantly, C doesn't constrain a programmer who needs to get up close and personal with the system hardware. That one characteristic would explain why Thompson and Ritchie rewrote the UNIX kernel in C way back when (although not entirely so).

That said, I still tend to reach for an assembler if I am contemplating twiddling bits in a chip register. C can do it, of course, using bit fields and the various shift and mask operators, but the work required to, say, check the source of an interrupt in a 2692 DUART's ISR (register 0x06) is as great as doing it in assembly language, and, of course, slower, since a skilled assembly language programmer can write more succinct code. In either case, you have to define the hardware location of the DUART, as well as the chip register offsets, and you still have to mask bits to determine what is going on with the particular register. So the advantage C might have as a system language is effaced to some extent when diddling with hardware.
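For comparison, the C side of that job might look something like the sketch below. The register offset is the one quoted above; the bit positions and every name here are illustrative assumptions, not taken from the data sheet, and the "device" in any test run is just an array standing in for the chip:

```c
#include <stdint.h>

#define DUART_ISR_OFFSET 0x06u      /* offset as quoted in the text */

/* Illustrative bit assignments only; check the data sheet. */
#define IRQ_TX_READY_A (1u << 0)
#define IRQ_RX_READY_A (1u << 1)
#define IRQ_TX_READY_B (1u << 4)
#define IRQ_RX_READY_B (1u << 5)

enum irq_source { IRQ_NONE, IRQ_RX_A, IRQ_RX_B, IRQ_TX_A, IRQ_TX_B };

/* Read one device register at base + offset. */
uint8_t read_reg(const volatile uint8_t *base, unsigned offset)
{
    return base[offset];
}

/* Mask the status bits in a fixed priority order:
 * receivers first, then transmitters. */
enum irq_source classify(uint8_t status)
{
    if (status & IRQ_RX_READY_A)
        return IRQ_RX_A;
    if (status & IRQ_RX_READY_B)
        return IRQ_RX_B;
    if (status & IRQ_TX_READY_A)
        return IRQ_TX_A;
    if (status & IRQ_TX_READY_B)
        return IRQ_TX_B;
    return IRQ_NONE;
}
```

As the comment says, the base address, the offsets and the masks all still have to be spelled out by hand, just as they would be in assembly language.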

...the lack of any form of exception handling is inexcusable in a modern language.

Again, that omission reflects the "low level" nature of C. There's no exception handling in M/L either, other than interrupts that might be produced due to a memory page fault or similar. I suppose some sort of exception handling could be built into C, but then it would make the language somewhat system-dependent, which it shouldn't be.

However, the real strength of C is nothing to do with the virtues of the language per se at all. That is its very ubiquity.

I suppose the same thing could be said about Windows. <Grin>



Wed Sep 30 17:38:02 2009: 7020   TonyLawrence

I suppose the same thing could be said about Windows.

Go wash out your mouth with soap.



Thu Oct 1 11:49:07 2009: 7029   Bob

Once you get past specific hurdles in C it becomes very easy to use. Memory management and the use of pointers are not easy to learn but once learned where is the problem? I suspect that the reason for the popularity of C is that it is in fact so easy for those with the talent. For those without there are plenty of other avenues of employment.



Thu Oct 1 14:39:09 2009: 7030   AndrewSmallshaw

I think I had best clarify here: when I say ML I am referring to the functional language and not machine language. That does indeed give you many high level constructs: exceptions are actually a base type for instance.



Thu Oct 1 15:08:57 2009: 7031   AndrewSmallshaw

To respond to Bob: yes, memory management in C is simple but that does not make it straightforward: you lack powerful abstraction abilities. A case in point is the central data structure for a pet project (i.e. not one I'm being paid for) I have been working on on and off for the past couple of years.

Make no mistake: that is a highly sophisticated structure - high performance, extremely flexible and almost infinitely scalable. However, it is also fairly compact - it is only about 1100 lines long, and when you strip comments, whitespace, braces and assert()s it comes down to barely over 650 lines. I haven't kept track of how much time that code has soaked up but it must be well over 100 hours. Why? Because I don't know what * and & do? Hardly.

It is simply a matter of sophistication - counting up all pointer operations (unary * and &, -> and []) I see well over 3200 operations in that 650 lines - there are many lines with over a dozen such ops in them, many of them working on variable length arrays. That is a huge level of complexity to keep in mind while coding. C provides only limited abilities to abstract that detail away: I mentioned ML in the article and I used that to check that the underlying idea actually worked with no fatal flaws in the logic: I had a working model in less than a day. A complex structure is never going to be trivial to implement but you can do a lot better than the elementary facilities C provides.



Thu Oct 1 15:17:03 2009: 7032   TonyLawrence

I agree with Andrew - I've taught this stuff to people and it can be very confusing.

Yes, Bob, some people do have the talent - but with other languages, less of that kind of "talent" is needed. And yes, of course that comes with the cost of bloat, so for those who can do it, C might be "better".

I have long said that the best language is the one you know best.



Fri Oct 2 07:30:36 2009: 7042   drag

Yes..

People's obsession with using C for everything is confusing to me. A lot of people have that attitude in Linux and it seems to me that it is a very poor way to go around writing user interfaces and things like that. C++ is not much better.

But the major advantage of C in Linux-land is that C libraries can be imported into the popular higher level languages. So if you want to write a library that will get used in C, Python, Perl, Ruby, etc., then C is the lowest common denominator.

The other major advantage is the kernel development stuff.

------------------

But I will avoid using C as much as possible. I like Python and I can do plenty of binary manipulation and low-level programming in it if that is what is needed. I have not run into a situation yet where I absolutely required anything more low-level, with the sole exception of working on a bootloader for an ARM system, where even C is not low-level enough for the first stage. I have to use assembly for putting the CPU and coprocessors into a known state, configuring RAM modules, and setting up the flash for initial access. Luckily most of what I need has already been written by other people.

My favorite and most complex thing I've written in Python is a binary conversion tool. Its job is to take a flash memory image dump of an XScale WinCE system obtained through JTAG, divide up its blocks, read the wear leveling information, do a number of byte swap operations (the developers of the file system were a bit overzealous with that) and build a drive image containing a FAT16 file system that contains the contents of the writable portion of that flash memory. That version of Windows CE used a special "transaction safe" FAT file system that interacted with a proprietary block to memory translation layer with wear leveling features that I had to reverse engineer.

It took me weeks to work out the format, but once I did it took a single evening to write the utility and another day to debug it. And it was done and ready for production use.

------------

High-level languages are designed as a substitution for effort, not skill. If a skilled programmer is given 40 hours to do a job it'll probably get done better if they use something other than C. You still have to know what the hell is going on to get the best out of them. You still have to understand everything that you have to know to program in C and assembly to really unleash their power and get acceptable performance.

I don't have that skill... I am not a terribly good formal programmer, but what it means is that my Python programs, even though they are crappy, get done a hell of a lot faster than if I had written them in C (and they would still be crappy).






Fri Oct 2 17:42:34 2009: 7050   TonyLawrence

Speaking of programming, learning from others and using modules:

Mark Belanger, an old time Unix guy and new reader here, noticed that email comments come through with original html tags.

So if I BOLD something, you'd see the "b" tags in the email if you subscribe to the comments for that page.

He suggested I use the Perl Mail::Sendmail module, which has the ability to send the email so that your client should see the html text.

Simple enough:

 
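    use Mail::Sendmail;

    # A Content-Type of text/html lets the mail client render the tags.
    my %mail = (
        To                          => $MailList,
        From                        => $LdapUserEmail,
        Subject                     => $Subject,
        smtp                        => $taskglobals::smtp,
        'Content-Type'              => 'text/html',
        'Content-Transfer-Encoding' => '8BIT',
        Message                     => $body,
    );

    # sendmail() returns false on failure and sets $Mail::Sendmail::error.
    sendmail(%mail) or die $Mail::Sendmail::error;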


Works very nicely - thanks Mark!








Sat Oct 3 03:43:35 2009: 7054   IRJ

Interesting article. I believe every language has its merits and demerits. You choose a language based on what you want your application to do. C's concept is to give complete freedom to the developer. Remember, with freedom comes responsibility. It is natural that people only want freedom and not the responsibility and start blaming C.

In short I will say -- C is meant for real geeks and nerds, whereas other languages are for a common man to code.

By the way C is definitely not primitive. Fortran existed much before C. That only goes to say that C was designed only with the purpose.

My other point is that arguing about which language is best, without analysing what you want to use it for, only goes to prove that the author has no knowledge of the anthropology of computer languages and is inviting trouble for himself.



Sat Oct 3 07:53:20 2009: 7055   ZubinMithra

I wonder why kernel modules are still written in C (the language whose users you refer to as 'crap').

Is the C language crappy just because you don't use it? Sorry, but this point is very offensive (an attempt to get traffic?)



Sat Oct 3 11:57:54 2009: 7056   TonyLawrence

Is the C language crappy just because you don't use it?

Are your comments worth anything if you plainly did not read his article?

Second to last paragraph:

Of course, all this is not to say C is completely without merit: on the contrary, it is still one of my preferred languages.

You really need to READ before you comment.



Sun Oct 4 00:32:12 2009: 7059   BigDumbDinosaur

By the way C is definitely not primitive. Fortran existed much before C. That only goes to say that C was designed only with the purpose.

What does relative age have to do with it? BASIC, COBOL, ALGOL and a few others are older than C as well but are definitely high level languages that require little or no understanding of the underlying hardware.

C is primitive because it works close to the underlying machine architecture and the native machine language understood by the MPU. While parts of C reflect modern language design (e.g., FOR loops, structure, math handling, etc.), others reflect the machine-oriented nature of the language (e.g., pointers).

To the user who was castigating (I think) Tony for posting this article and referring to C as "crappy," you did a fine job, sir or madam, of reading the article. Perhaps one of my grandsons can teach you the difference between reading and understanding...
