There was a long series of comments on the article about mdfind that got very confused when talking about OS X metadata. I thought I'd try to straighten some of that out in a separate post - though honestly I'm still easily confused myself!
First, what metadata are we talking about? For an old Unix hand, metadata is the information stored in the inode: file size, permissions, pointers to data blocks, link counts. That's traditional metadata.
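Most of those inode fields are visible through stat(2). Here's a minimal Python sketch that pulls them out with os.stat; the function name inode_metadata is mine, purely for illustration. The pointers to data blocks aren't exposed this way, so they're left out.

```python
import os
import stat
import tempfile


def inode_metadata(path):
    """Return the traditional inode fields that os.stat() exposes."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,
        "size": st.st_size,
        "permissions": stat.filemode(st.st_mode),  # e.g. "-rw-r--r--"
        "links": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file")
        with open(path, "w") as fh:
            fh.write("hello\n")
        print(inode_metadata(path))
        # A hard link is a second name for the same inode,
        # so the link count goes up by one.
        os.link(path, os.path.join(tmp, "second-name"))
        print(inode_metadata(path)["links"])
```

None of this lives in the file's contents: the file here holds six bytes of text and nothing else.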
However, there's more metadata today - not just in Unix systems, but especially in Mac OS X. There are extended permissions, ACLs, extended attributes (xattrs), and Spotlight-related metadata. It's very hard to ferret all this out of Google because similar terms are used for dissimilar features.
Macs had "resource forks" early on. OS X still has resource forks, but apparently Apple would like to move away from those. That's probably why things get so darn confusing: search for information on metadata and OS X and you'll find lots of pointers to things that talk about resource forks, but that material is mostly deprecated and usually doesn't apply to OS X.
Let's take Spotlight metadata first. These are specific keys that Spotlight indexes. For example, you can do things like this:
mdfind 'kMDItemFSSize > 20000000'
mdfind 'kMDItemFinderComment == "script application wrapper"'
mdfind 'kMDItemTextContent == "*Seneca*" && kMDItemFSName != "*emlx"'
mdfind 'kMDItemTextContent == "*Seneca*" && kMDItemContentType != "com.apple.mail.emlx"'
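If you'd rather drive queries like these from a script, here's a rough sketch that shells out to mdfind from Python. The helper name and the choice to return None when mdfind isn't installed are my own; -onlyin is mdfind's flag for limiting the search to one directory.

```python
import shutil
import subprocess


def mdfind(query, only_in=None):
    """Run an mdfind query and return the matching paths as a list.

    Returns None when mdfind is not available (for example, off the Mac).
    """
    if shutil.which("mdfind") is None:
        return None
    cmd = ["mdfind"]
    if only_in:
        cmd += ["-onlyin", only_in]
    # The query is passed as a single argument, so the quotes, "*" and "&&"
    # inside it never go through a shell.
    cmd.append(query)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.splitlines()
```

For example, mdfind('kMDItemFSSize > 20000000', only_in="/Users") would list the large files under /Users.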
How does Spotlight get the info to index? It asks a Spotlight Importer. This "BASICS OF SPOTLIGHT" page explains:
Once the Mac OS does kick-off the extraction of metadata from a file, it does so through a Spotlight Importer. Spotlight Importers are plug-ins for the Mac OS that a developer provides specifically for helping files created by their applications to be searchable within Spotlight. Spotlight crawls through its list of changed files, handing each one to the appropriate importer. The importers then read the files, compile a list of metadata, and then hand the metadata back to Spotlight. At this point, the changed file is available for searching within Spotlight.
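To make that flow concrete, here's a toy model in Python. This is not Apple's importer API (real importers are plug-ins matched by content type); the IMPORTERS table, the matching by file extension, and the function names are all invented for illustration.

```python
import os
import tempfile


def text_importer(path):
    """Toy importer: hand the file's text back as kMDItemTextContent."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return {"kMDItemTextContent": fh.read()}


# Hypothetical registry. Real Spotlight matches importers by content type,
# not by file extension.
IMPORTERS = {".txt": text_importer}


def index_changed_files(changed_paths):
    """Hand each changed file to its importer and collect the results."""
    index = {}
    for path in changed_paths:
        # Keys the system can fill in without any importer's help.
        attrs = {
            "kMDItemFSName": os.path.basename(path),
            "kMDItemFSSize": os.path.getsize(path),
        }
        importer = IMPORTERS.get(os.path.splitext(path)[1].lower())
        if importer is not None:
            attrs.update(importer(path))
        index[path] = attrs
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name in ("letters.txt", "file"):
            path = os.path.join(tmp, name)
            with open(path, "w", encoding="utf-8") as fh:
                fh.write("Seneca wrote letters\n")
            paths.append(path)
        for path, attrs in index_changed_files(paths).items():
            print(os.path.basename(path), sorted(attrs))
```

In this model a file with no matching importer still ends up in the index, just with the filesystem-level keys and nothing else.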
OK, great, but where does the metadata that the importer supplies come from?
Apparently, that's up to the developer. Apple's Extracting Metadata from Documents says:
Avoid the use of external files to store metadata content. All critical metadata should be in the same file as the data. The system store of metadata should be considered volatile.
I want to quibble a little: if it's stored in the data file, it's really not metadata, is it? But never mind. Some apps do it that way; ID3 tags are one example. But other apps do not. For example, in my ~/Library/Caches/Metadata I found some interesting stuff. *Some* apps store Spotlight metadata there. I found:
$ ls ~/Library/Caches/Metadata
Billings Microsoft Safari
Camino Precipitate com.evernote.Evernote
If I look in Billings, I find this:
But obviously not all apps store their Spotlight-related metadata there. Entourage does, as seen in this "HOW DOES ENTOURAGE WORK WITH SPOTLIGHT?" bit:
When you enable Spotlight indexing within Entourage, a "cache" file is created for each item within your Entourage database. If you have 100,000 e-mail messages in your Entourage database, 100,000 cache files will be created. If you want to see the cache files, you can find them within your Library/Caches/Metadata/Microsoft folder.
Each cache file contains all the metadata that will be needed for indexing by Spotlight. All changes within Entourage are reflected to the cache files. Create a new item and a new cache file will be created. Updated an item and its cache file will update. Delete an item and its cache file will be deleted. With all these changes, Spotlight receives file change notifications and eventually will ask the modified cache files to go through the import process using the Entourage Spotlight Importer.
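If you want to check the one-cache-file-per-item claim on your own machine, something like this would count the files under each application's folder. It's only a sketch, assuming the ~/Library/Caches/Metadata layout described above; the function name is mine.

```python
import os


def count_cache_files(root=None):
    """Map each folder under the metadata cache to its number of files."""
    if root is None:
        root = os.path.expanduser("~/Library/Caches/Metadata")
    counts = {}
    if not os.path.isdir(root):
        return counts  # nothing to report (not a Mac, or no cache yet)
    for entry in sorted(os.listdir(root)):
        app_dir = os.path.join(root, entry)
        if os.path.isdir(app_dir):
            # Walk the whole subtree; cache files may be nested in subfolders.
            counts[entry] = sum(len(files) for _, _, files in os.walk(app_dir))
    return counts


if __name__ == "__main__":
    for app, total in count_cache_files().items():
        print(app, total)
```

An application that keeps its metadata inside its own data files simply won't show up in the result.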
But there's no iTunes folder there.
There are also defaults. If I create a text file with "date > file", running "mdls file" will show Spotlight keys:
kMDItemContentCreationDate = 2009-04-12 12:07:02 -0400
kMDItemContentModificationDate = 2009-04-12 12:07:02 -0400
kMDItemContentType = "public.data"
kMDItemContentTypeTree = (
kMDItemDisplayName = "file"
kMDItemFSContentChangeDate = 2009-04-12 12:07:02 -0400
kMDItemFSCreationDate = 2009-04-12 12:07:02 -0400
kMDItemFSCreatorCode = ""
kMDItemFSFinderFlags = 0
kMDItemFSHasCustomIcon = 0
kMDItemFSInvisible = 0
kMDItemFSIsExtensionHidden = 0
kMDItemFSIsStationery = 0
kMDItemFSLabel = 0
kMDItemFSName = "file"
kMDItemFSNodeCount = 0
kMDItemFSOwnerGroupID = 501
kMDItemFSOwnerUserID = 501
kMDItemFSSize = 29
kMDItemFSTypeCode = ""
kMDItemKind = "Plain text"
kMDItemLastUsedDate = 2009-04-12 12:07:02 -0400
kMDItemUsedDates = (
2009-04-12 00:00:00 -0400
Obviously the "date" command didn't create those. Spotlight won't even index that file (no extension), but it has some default keys just the same! See Spotlight, mdfind (Mac OS X Tiger searching) for more on that.
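If you want those keys in a script, the mdls output is easy enough to pick apart. Here's a rough parser, assuming the "key = value" layout shown above with multi-value keys wrapped in parentheses; parse_mdls is my own name, and every value is left as a string.

```python
def parse_mdls(text):
    """Parse mdls-style output into a dict of strings and lists of strings."""
    attrs = {}
    current_list_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if current_list_key is not None:
            # Inside a parenthesized list: collect items until the ")".
            if line == ")":
                current_list_key = None
            else:
                attrs[current_list_key].append(line.rstrip(",").strip('"'))
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        key, value = key.strip(), value.strip()
        if value == "(":
            attrs[key] = []
            current_list_key = key
        else:
            attrs[key] = value.strip('"')
    return attrs


if __name__ == "__main__":
    sample = (
        'kMDItemFSName = "file"\n'
        "kMDItemFSSize = 29\n"
        "kMDItemUsedDates = (\n"
        "    2009-04-12 00:00:00 -0400\n"
        ")\n"
    )
    for key, value in parse_mdls(sample).items():
        print(key, "->", value)
```

On a Mac you could feed it the text captured from "mdls file" instead of the sample string.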
You can add metadata of your own with xattr, and you can modify one item in Spotlight's domain.
$ xattr -w mystuff "hello there" file
$ xattr -l file
mystuff: hello there
The only Spotlight-related data you can modify is kMDItemFinderComment. You do that with the Finder's Get Info window; after adding a comment, xattr shows this:
$ xattr -l file
com.apple.metadata:kMDItemFinderComment:
0000 62 70 6C 69 73 74 30 30 5A 4D 79 20 43 6F 6D 6D bplist00ZMy Comm
0010 65 6E 74 08 00 00 00 00 00 00 01 01 00 00 00 00 ent.............
0020 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
0030 00 00 00 13 ....
mystuff: hello there
Note that this gives us the clue as to where the data was stored, but I don't find a file with that "com.apple.metadata" name. I do find:
But those aren't related.
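One thing the hex dump above does tell us: the value is a binary property list (note the "bplist00" at the front), and Python's plistlib can decode it. Here's a small sketch with the bytes copied from that dump. The read_finder_comment helper is an assumption on my part: it takes the attribute to be named com.apple.metadata:kMDItemFinderComment and relies on "xattr -px" printing the value as hex.

```python
import plistlib
import shutil
import subprocess

# The bytes from the xattr hex dump, copied row by row.
HEX_DUMP = (
    "62 70 6C 69 73 74 30 30 5A 4D 79 20 43 6F 6D 6D "
    "65 6E 74 08 00 00 00 00 00 00 01 01 00 00 00 00 "
    "00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 13"
)


def decode_comment(hex_text):
    """Turn a hex dump of a binary plist back into the value it holds."""
    raw = bytes.fromhex("".join(hex_text.split()))
    return plistlib.loads(raw)


def read_finder_comment(path):
    """Read and decode the Finder comment xattr; None if it can't be read."""
    if shutil.which("xattr") is None:
        return None
    # Assumed attribute name and flags; see the note above.
    cmd = ["xattr", "-px", "com.apple.metadata:kMDItemFinderComment", path]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return decode_comment(result.stdout)


if __name__ == "__main__":
    print(decode_comment(HEX_DUMP))  # prints: My Comment
```

So the comment typed into Get Info is sitting right there in the attribute value, wrapped in plist packaging.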
So what do we know? Well, we know it's up to the application responsible for a file to provide importer code. It's up to the same app to decide where to store metadata. Obviously, that implies that for some data that would be the same across all files of this type, there's no need to store it anywhere - the importer could generate the response when Spotlight asks.
That's as far as I've gone. Maybe someone else can add more.
Got something to add? Send me email.
© 2009-11-07 Anthony Lawrence