First, certain Fox News reporters and some military types drive
me nuts mispronouncing this. It's "kash", not "kash-ay" (cachet,
which is pronounced that way, is related by origin but is quite
distinct in meaning) and that's all that needs to be said. Except
perhaps "Foo!" toward those who perpetuate this barbarism. I get
grumpy about this kind of thing: I also deplore the recent trend of
using "monetize" where "commercialize" is the proper word.
The barbarians are referring to weapons caches, and we'll be
looking at data caching (instructions just being a form of data
too), but it's the same idea: storing something where you can get
it when you need it. In the case of computers, we aren't trying to
hide data, just make our access to it quicker or more convenient.
Speed sometimes comes from faster storage: RAM is faster than a disk
drive, CPU cache memory is faster than RAM, and on-chip pipelines
are faster still. Other times it comes from using indexes (often
hashes), and very often both are used.
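Here's a rough sketch of that "both" case in Python: a hash (a plain
dict) held in memory, sitting in front of something slow. The
slow_lookup function and the HashCache class are names I made up for
illustration; the sleep just stands in for a slow device.

```python
import time

def slow_lookup(key):
    """Hypothetical stand-in for slow storage; the sleep mimics latency."""
    time.sleep(0.01)
    return key.upper()

class HashCache:
    """Keep results in a dict: a hash index held in faster storage."""

    def __init__(self, backend):
        self.backend = backend
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:        # fast path: already cached
            self.hits += 1
            return self.store[key]
        self.misses += 1             # slow path: ask the backend
        value = self.backend(key)
        self.store[key] = value      # remember it for next time
        return value

cache = HashCache(slow_lookup)
for name in ["cat", "ls", "cat", "cat"]:
    cache.get(name)
print("hits:", cache.hits, "misses:", cache.misses)
```

Only the first request for each name pays the slow price. Note that
nothing here ever throws an entry away, which a real cache would
have to do.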
Levels of cache
There are multiple levels of caching going on in any modern
computer system. Consider just a simple "cat /etc/termcap". There's
so much going on there I'll probably miss something, but we'll give
it a try:
- The cat executable needs to be loaded into memory so that it
can run. Before that can happen, we have to find "cat". The places
to look are stored in the PATH variable, which is a cache. To look
within a given PATH, directories have to be searched and the OS
keeps a cache of directory names to help that process. It's the
"namei" cache, and examining its performance with sar (if you have
it) is one of many ways you can see when people actually start
working (as opposed to when they logged in). As people actually
start accessing files, cache misses will likely show up in the
- There's an inode cache that will be used once the desired inode
number is determined. That helps tell us where the data blocks of
the "cat" executable actually are.
- The kernel's buffer cache will be consulted for "cat"'s data
blocks, and because of read-ahead, some data will come from that
cache even if it wasn't there to begin with.
- The CPU has its own cache of data; multiple levels of caches,
actually, so anything that is read, code or data, will move through
those caches too.
- All of "cat" isn't actually loaded into memory; only enough of
it to get started is actually brought in. The virtual memory system
will pull in more of it as it is needed, but it also may put parts
of it (or even all of it under the worst conditions) out to swap,
which of course is yet another cache.
- As "cat" uses shared libraries, those are probably in cache and
(depending on the OS) there may be a special cache for that code.
- Before cat can read the file, it has to find it. The namei
cache will be used again. So of course will the CPU caches, and the
buffer cache, and the inode cache.
- Once it has located the actual inode, cat will start requesting
data blocks from the disk. Each access will be checked against the
kernel's buffer cache again, and the CPU caches will do their part.
On most Unixes, the file system buffer cache is a tunable
parameter that you can often adjust for better performance. If you
have lots of memory that isn't otherwise being used, this can be a
smart move. On Linux, however, most unused memory is
automatically assigned for that use. That sometimes makes people
nervous, because they see (from 'cat /proc/meminfo' or from 'top')
that most of their memory is in use. There's nothing to be
concerned about, because as soon as anything needs memory, the
system will give it up from the buffer cache.
- Cat will, of course, do its own buffering, which is a cache.
Data will be stored away, and sent to stdout when it can be.
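If the Linux memory numbers mentioned above make you nervous, a few
lines of Python can add them up. This is only a sketch: it reads
text in the /proc/meminfo layout and sums the MemFree, Buffers and
Cached fields. The sample figures are invented, the function names
are mine, and the sum is a rough estimate (not every cached page can
really be given back; newer kernels report a MemAvailable field that
is a better guide).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue                      # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info

def reclaimable_kb(info):
    """Rough estimate: free memory plus buffer and page cache."""
    return (info.get("MemFree", 0)
            + info.get("Buffers", 0)
            + info.get("Cached", 0))

# Invented sample figures, not measurements from a real machine.
SAMPLE = """MemTotal:        8000000 kB
MemFree:          200000 kB
Buffers:          300000 kB
Cached:          5500000 kB
"""

info = parse_meminfo(SAMPLE)
print(reclaimable_kb(info), "kB of", info["MemTotal"], "kB is free or cache")
```

On a real Linux machine you would pass in
open("/proc/meminfo").read() instead of SAMPLE.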
No doubt I've forgotten something, but you probably get the
idea: there are a LOT of caches in use. Maintaining those caches
and keeping them accurate is the hard part. If good chunks of
/etc/termcap are in cache, and some other program now modifies that
file, future requests for the affected data blocks need to return
the correct bytes. Keep in mind that the writing of data is
buffered too, so this has to be very carefully managed. It would be
unusual for a program like "cat" to care about data changes
invalidating what it has in its own buffers, but some programs do
have to worry about such things.
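For a program that does have to worry, one common user-level
approach is to remember the file's size and modification time along
with the cached bytes, and re-read when either changes. The sketch
below is my own illustration (FileCache is a made-up name), not how
the kernel does it, and it can be fooled by a same-size rewrite that
lands within the timestamp's resolution.

```python
import os
import tempfile

class FileCache:
    """Cache file contents; re-read when size or mtime changes."""

    def __init__(self):
        self._entries = {}   # path -> (signature, data)
        self.reloads = 0

    def _signature(self, path):
        st = os.stat(path)
        return (st.st_mtime_ns, st.st_size)

    def read(self, path):
        sig = self._signature(path)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == sig:
            return entry[1]              # still valid: serve from cache
        with open(path, "rb") as f:      # missing or stale: go to the file
            data = f.read()
        self._entries[path] = (sig, data)
        self.reloads += 1
        return data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "termcap.sample")
    with open(path, "wb") as f:
        f.write(b"old entry\n")
    cache = FileCache()
    first = cache.read(path)             # read from the file
    again = cache.read(path)             # served from cache
    with open(path, "wb") as f:
        f.write(b"new, longer entry\n")  # some other program modifies it
    fresh = cache.read(path)             # size changed, so it is re-read
    print(cache.reloads, fresh)
```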
CPU caches have their own data consistency issues. While part of
that housekeeping is up to the OS, on multiprocessor systems
lower-level hardware support has to be involved too. The biggest problem
is that each CPU has its own cache(s) and consistency simply has to
be maintained between them (a very good book that will tell you
more than you may want to know about this is reviewed at Unix Systems for Modern
Why so much caching?
Caching becomes more important as larger disparities in hardware
speed occur. CPU speed exceeded RAM speed quite some time ago:
without deep caches, modern processors would be constantly stalling
while waiting for RAM to provide the next instruction. While not
completely accurate and blatantly ignoring some rather important
details, a 100 MHz CPU would need 10 nanosecond memory if you
weren't caching. That's a lousy 100 MHz CPU - what we run nowadays
is just a little bit faster, right? But our RAM hasn't kept up at
all. Without lots of caching, these gigahertz CPUs would be
pointless (in many applications, they are pointless).
Of course disk drives are so slow that they'd seem nearly
motionless from the CPU's point of view.
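The arithmetic behind that is simple enough to check: one clock
cycle lasts 1/frequency, so 100 MHz works out to 10 nanoseconds. The
little Python sketch below does the division and then counts how
many cycles go by while the CPU waits. The RAM and disk latencies
are round numbers I assumed for illustration, not measurements.

```python
def cycle_time_ns(freq_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / freq_hz

def stall_cycles(freq_hz, latency_ns):
    """Clock cycles that pass while waiting latency_ns for data."""
    return latency_ns / cycle_time_ns(freq_hz)

# Assumed round-number latencies, for illustration only.
ASSUMED_RAM_NS = 100
ASSUMED_DISK_NS = 10_000_000   # 10 milliseconds

for label, hz in [("100 MHz", 100e6), ("3 GHz", 3e9)]:
    print(label,
          "cycle:", cycle_time_ns(hz), "ns;",
          "RAM wait:", stall_cycles(hz, ASSUMED_RAM_NS), "cycles;",
          "disk wait:", stall_cycles(hz, ASSUMED_DISK_NS), "cycles")
```

The faster the clock, the more cycles each trip to RAM or disk
wastes, which is the whole argument for deeper caches.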
When we bring network access into the picture, even more caches
and buffers come into play (some might argue that a buffer isn't a
cache, but in fact it is: it's just more temporary). Your web
browser caches the very bytes you are looking at now, and both your
machine and my server went through the various caches described
above to get that data on the screen to start with. Clusters and
other forms of parallelism maintain caches between themselves also.
Like the problems of CPU caches touched on above, maintaining
coherency of data without sacrificing speed is likely to be a major
part of the design goals.
You might say that the opposite of caching is archiving: moving
less frequently accessed data to a slower or less convenient
location. However, that is a form of caching too: the archival
cache is more convenient in that it doesn't sap more expensive or
more limited resources. There are archival systems that are at
least somewhat transparent to the user: if a resource is not
available from cache memory, and isn't on disk, it will be
automatically loaded from tape or CD-ROM or wherever else it might
be stored. The virtual memory system concept is extended to the
disk, though the wait time might be quite long.
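A toy version of that kind of transparent lookup, in Python: try the
fastest place first, fall back to slower ones, and copy whatever is
found up to the fast tier. TieredStore and the tier names are
invented for this sketch, and plain dicts stand in for real memory,
disk and tape.

```python
class TieredStore:
    """Look in fast tiers first; promote what is found to the fastest."""

    def __init__(self, tiers):
        # tiers: list of (name, dict) pairs, fastest first
        self.tiers = tiers

    def get(self, key):
        for name, store in self.tiers:
            if key in store:
                value = store[key]
                self.tiers[0][1][key] = value   # promote to fastest tier
                return value, name
        raise KeyError(key)

memory = {}
disk = {"report": "2011 figures"}
tape = {"ledger": "1998 figures"}
store = TieredStore([("memory", memory), ("disk", disk), ("tape", tape)])

print(store.get("ledger"))   # found on the slowest tier
print(store.get("ledger"))   # now served from memory
```

The caller never says where the data lives; it only notices that
the first request takes longer.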
You might also be interested in cache data corruption and Invalidating the Linux buffer cache.
© 2012-03-11 Tony Lawrence