2005/03/01 Docfile,OLE2

© March 2005 Tony Lawrence

Microsoft applications use OLE2 file formats (though now XML is the coming thing). If you examined an OLE2 document, you'd find that it begins with the hex bytes "D0 CF 11 E0", which read a bit like "DOCFILE0" and are why these are sometimes referred to as "DOCFILE" format. It's also called "Structured Storage", and you can find some info at MSDN.
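If you want to check for that signature yourself, here is a minimal Python sketch. The four bytes quoted above are the start of a full eight-byte signature, D0 CF 11 E0 A1 B1 1A E1. The names OLE2_MAGIC, looks_like_ole2 and file_looks_like_ole2 are made up for this example and don't come from any library.

```python
# Full eight-byte signature found at the very start of an OLE2 / DocFile.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(data: bytes) -> bool:
    """Return True if the buffer begins with the OLE2 signature."""
    return data[:len(OLE2_MAGIC)] == OLE2_MAGIC


def file_looks_like_ole2(path: str) -> bool:
    """Read only the first eight bytes of a file and test them."""
    with open(path, "rb") as f:
        return looks_like_ole2(f.read(len(OLE2_MAGIC)))
```

A matching signature only tells you the container type. It doesn't say whether the thing inside is a Word document, an Excel sheet or something else.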

Of course Microsoft doesn't fully document these formats, but if you pressed on, you'd also find that a DocFile looks an awful lot like a FAT file system, and it's apparent that Microsoft apps use it in a similar manner, looking up sections much as you'd look for files or directories on a disk. That can get pretty interesting, as was noted at https://www.advogato.org/article/754.html

Even funnier is the fact that although structured storage is able
to hold another storage inside (that happens when you embed Excel
sheet in Word document), yet PowerPoint will rather hold those
pictures/clip arts of your slides in one big chunk and manage them
by itself. Think of it: your presentation is on the disk (using
filesystem of your OS), the content is stored as structured storage
(another form of filesystem) and the pictures are yet inside another
container (alias mini filesystem). What a joy.
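To make the FAT comparison a little more concrete, here is a rough Python sketch that pulls a few fields out of the first bytes of a DocFile. The field offsets are my reading of the layout in the compound file format description Microsoft published later, so treat them as assumptions. The function name parse_docfile_header and the dictionary keys are invented for this example.

```python
import struct

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_docfile_header(header: bytes) -> dict:
    """Decode a few FAT-related fields from the start of a DocFile.

    Assumed offsets (little-endian): versions, byte order mark and
    sector shifts at 24..33, FAT sector count at 44, first directory
    sector at 48.
    """
    if len(header) < 52:
        raise ValueError("need at least the first 52 bytes of the file")
    if header[:8] != OLE2_MAGIC:
        raise ValueError("not an OLE2 / DocFile signature")

    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", header, 24
    )
    num_fat_sectors, first_dir_sector = struct.unpack_from("<2I", header, 44)

    return {
        "major_version": major,
        "minor_version": minor,
        "byte_order": byte_order,
        # Sizes are stored as powers of two, much like cluster sizes.
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
        "num_fat_sectors": num_fat_sectors,
        "first_dir_sector": first_dir_sector,
    }
```

The directory sector plays roughly the role of a root directory, and the FAT sectors chain the data sectors together, which is where the "file system inside a file" feeling comes from.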

