APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed


First, certain Fox News reporters and some military types drive me nuts mispronouncing this. It's "kash", not "kash-ay" (cachet, which is pronounced that way, is related by origin but is quite distinct in meaning) and that's all that needs to be said. Except perhaps "Foo!" toward those who perpetuate this barbarism. I get grumpy about this kind of thing: I also deplore the recent trend of using "monetize" where "commercialize" is the proper word. Anyway...

The barbarians are referring to weapons caches, and we'll be looking at data caching (instructions just being a form of data too), but it's the same idea: storing something where you can get it when you need it. In the case of computers, we aren't trying to hide data, just make our access to it quicker or more convenient. Speed sometimes comes from faster storage: RAM is faster than a disk drive, CPU cache memory is faster than RAM, and on-chip pipelines are faster still. Other times it comes from using indexes (often hashes), and very often both are used.
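To make the index half of that concrete, here is a minimal Python sketch. The records and the names find_by_scan and build_index are made up for illustration; the only point is that a hash (a Python dict) turns a search through every record into a single lookup.

```python
# Hypothetical data: (name, value) records, as they might be read from a file.
records = [("vt100", 1), ("xterm", 2), ("ansi", 3), ("linux", 4)]

def find_by_scan(name):
    """Look at every record until the name matches (no index)."""
    for key, value in records:
        if key == name:
            return value
    return None

def build_index(recs):
    """Build a hash index once: name -> value."""
    return {key: value for key, value in recs}

index = build_index(records)

# Same answer either way; the index just skips the searching.
print(find_by_scan("ansi"), index.get("ansi"))
```

Building the index costs one pass over the data, so it only pays off when the same data is looked up more than once, which is true of most caches.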

Levels of cache

There are multiple levels of caching going on in any modern computer system. Consider just a simple "cat /etc/termcap". There's so much going on there I'll probably miss something, but we'll give it a try:

  • The cat executable needs to be loaded into memory so that it can run. Before that can happen, we have to find "cat". The places to look are stored in the PATH variable, which is a cache. To look within each PATH entry, directories have to be searched, and the OS keeps a cache of directory names to help that process. It's the "namei" cache, and examining its performance with sar (if you have it) is one of many ways you can see when people actually start working (as opposed to when they logged in). As people start accessing files, cache misses will likely show up in the stats.

  • There's an inode cache that will be used once the desired inode number is determined. That helps tell us where the data blocks of the "cat" executable actually are.

  • The kernel's buffer cache will be consulted for "cat"'s data blocks, and because of read-ahead, some data will come from that cache even if it wasn't there to begin with.

  • The CPU has its own cache of data; multiple levels of caches, actually, so anything that is read, code or data, will move through those caches too.

  • All of "cat" isn't actually loaded into memory; only enough of it to get started is actually brought in. The virtual memory system will pull in more of it as it is needed, but it also may put parts of it (or even all of it under the worst conditions) out to swap, which of course is yet another cache.

  • As "cat" uses shared libraries, those are probably in cache and (depending on the OS) there may be a special cache for that code.

  • Before cat can read the file, it has to find it. The namei cache will be used again. So, of course, will the CPU caches, the buffer cache, and the inode cache.

  • Once it has located the actual inode, cat will start requesting data blocks from the disk. Each access will be checked against the kernel's buffer cache again, and the CPU caches will do their part as usual.

    On most Unixes, the size of the file system buffer cache is a tunable parameter that you can often adjust for better performance. If you have lots of memory that isn't otherwise being used, this can be a smart move. On Linux, however, most unused memory is automatically assigned for that use. That sometimes makes people nervous, because they see (from 'cat /proc/meminfo' or from 'top') that most of their memory is in use. There's nothing to be concerned about, because as soon as anything needs memory, the system will give it up from the buffer cache.

  • Cat will, of course, do its own buffering, which is a cache. Data will be stored away, and sent to stdout when it can be.
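If the Linux memory numbers mentioned in that list make you nervous, a short script can add them up. This is a rough sketch, assuming a Linux /proc/meminfo with the usual MemFree, Buffers and Cached lines. The function names are made up, and the sample text is invented so that the example runs anywhere.

```python
def parse_meminfo(text):
    """Turn 'Name:   12345 kB' lines into a dict of name -> kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info

def free_plus_cache_kb(info):
    """Free memory plus what the buffer and page caches are holding."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)

# Invented sample; on a real Linux box use open("/proc/meminfo").read().
sample = """MemTotal:        8000000 kB
MemFree:          500000 kB
Buffers:          300000 kB
Cached:          4200000 kB
"""

info = parse_meminfo(sample)
print("free:", info["MemFree"], "kB")
print("free plus cache:", free_plus_cache_kb(info), "kB")
```

Treat the total as an estimate only: not every cached page can be dropped instantly. Newer kernels also report a MemAvailable line, which is the kernel's own estimate of the same thing.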

No doubt I've forgotten something, but you probably get the idea: there are a LOT of caches in use. Maintaining those caches and keeping them accurate is the hard part. If good chunks of /etc/termcap are in cache, and some other program now modifies that file, future requests for the affected data blocks need to return the correct bytes. Keep in mind that the writing of data is buffered too, so this has to be very carefully managed. It would be unusual for a program like "cat" to care about data changes invalidating what it has in its own buffers, but some programs do have to worry about such things.
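A program that does have to worry might do something like the following. It's a minimal sketch, not how the kernel does it: the CachedFile class is hypothetical, and it assumes that a change in the file's modification time or size is a good enough sign that the cached copy is stale.

```python
import os
import tempfile

class CachedFile:
    """Keep a file's contents in memory; reread only if the file changed."""

    def __init__(self, path):
        self.path = path
        self.stamp = None   # (mtime_ns, size) seen at the last read
        self.data = None
        self.reads = 0      # how many times we really went to the file

    def get(self):
        st = os.stat(self.path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self.stamp:          # stale, or never loaded
            with open(self.path, "rb") as f:
                self.data = f.read()
            self.stamp = stamp
            self.reads += 1
        return self.data

# Demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.txt")
    with open(path, "wb") as f:
        f.write(b"old")
    cf = CachedFile(path)
    first = cf.get()
    second = cf.get()               # served from memory
    with open(path, "wb") as f:     # "some other program" modifies the file
        f.write(b"newer")
    third = cf.get()                # size changed, so it is reread
    print(first, second, third, cf.reads)
```

Note that even this "cache check" costs a stat call every time, and that a rewrite that keeps the same size within the same timestamp tick would slip past it. That trade between accuracy and speed is exactly the hard part.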

CPU caches have their own data consistency issues. While part of that housekeeping is up to the OS, on multiprocessor systems lower-level hardware support has to be involved too. The biggest problem is that each CPU has its own cache(s), and consistency simply has to be maintained between them. (A very good book that will tell you more than you may want to know about this is reviewed at Unix Systems for Modern Architectures.)

Why so much caching?

Caching becomes more important as the disparities in hardware speed grow. CPU speed exceeded RAM speed quite some time ago: without deep caches, modern processors would constantly stall waiting for RAM to provide the next instruction. While not completely accurate and blatantly ignoring some rather important details, a 100 MHz CPU has a 10 nanosecond clock cycle, so it would need 10 nanosecond memory if you weren't caching. That's a lousy 100 MHz CPU - what we run nowadays is just a little bit faster, right? But our RAM hasn't kept up at all. Without lots of caching, these gigahertz CPUs would be pointless (in many applications, they are pointless anyway).
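The arithmetic behind that figure is just one memory access per clock cycle. Here is a quick sketch. The 1 ns cache, 60 ns RAM and 95% hit rate are illustrative assumptions, not measurements, and the model ignores the same important details mentioned above.

```python
def cycle_time_ns(mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / mhz

def average_access_ns(hit_rate, cache_ns, ram_ns):
    """Average access time when a fraction hit_rate is served by the cache."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns

print(cycle_time_ns(100))    # 10.0 ns per cycle at 100 MHz
print(cycle_time_ns(3000))   # about a third of a nanosecond at 3 GHz

# Assumed figures: 1 ns cache, 60 ns RAM, 95% of accesses hit the cache.
print(average_access_ns(0.95, 1.0, 60.0))
```

Even with those generous made-up numbers, the few accesses that miss the cache account for most of the average, which is why the caches keep getting deeper.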

Of course disk drives are so slow that they'd seem nearly motionless from the CPU's point of view.

Network caching

When we bring network access into the picture, even more caches and buffers come into play (some might argue that a buffer isn't a cache, but in fact it is: it's just more temporary). Your web browser caches the very bytes you are looking at now, and both your machine and my server went through the various caches described above to get that data on the screen to start with. Clusters and other forms of parallelism maintain caches between themselves also. Like the problems of CPU caches touched on above, maintaining coherency of data without sacrificing speed is likely to be a major part of the design goals.


You might say that the opposite of caching is archiving: moving less frequently accessed data to a slower or less convenient location. However, that is a form of caching too: the archival cache is more convenient in that it doesn't sap more expensive or more limited resources. There are archival systems that are at least somewhat transparent to the user: if a resource is not available from cache memory, and isn't on disk, it will be automatically loaded from tape or CD-ROM or wherever else it might be stored. The virtual memory concept is extended beyond the disk, though the wait time might be quite long.
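The idea of falling through from fast storage to slow can be sketched in a few lines. Everything here is hypothetical: the three dicts merely stand in for memory, disk and tape, and fetch is an invented name, not any real archival system's interface.

```python
# Stand-ins for three tiers, fastest first. Real tiers would be RAM,
# a disk file system, and a tape or optical library.
memory = {}
disk = {"report.txt": b"quarterly numbers"}
tape = {"old-logs.tar": b"logs from years ago"}

tiers = [("memory", memory), ("disk", disk), ("tape", tape)]

def fetch(name):
    """Return (data, tier_found); copy the data into memory on the way."""
    for tier_name, store in tiers:
        if name in store:
            data = store[name]
            memory[name] = data      # promote so the next access is fast
            return data, tier_name
    raise KeyError(name)

print(fetch("old-logs.tar")[1])   # found on tape the first time
print(fetch("old-logs.tar")[1])   # found in memory the second time
```

A real system would also have to decide what to push back down when the fast tier fills up. That is left out here to keep the sketch short.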

You might also be interested in cache data corruption and Invalidating the Linux buffer cache.


© Tony Lawrence
