APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Backing up Virtual Machines

I had a conversation with a client yesterday about virtualization and backup. As there was some definite confusion in that conversation, I realized other people may need some orientation.

Today's VM systems can use "snapshots" to effectively capture an image of the VM at a moment in time. Microsoft calls this "Shadow Copy" but it's all the same idea: copy on write. The snapshot software creates a file that looks like a copy of the VM image, but it isn't. At that moment, it's actually working like a hard link: it is pointing to exactly the same disk blocks as the original file. If some process writes data to any block in the original, the target block is first copied to a new block that the snapshot file will use.

At any point in time, the snapshot file inode points to a mix of blocks: some are the same blocks that are still pristine in the original file, and some are new blocks that hold copies of that file's data as it was before it was overwritten.
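The copy on write idea described above can be sketched in a few lines of Python. This is only a toy model, not how any particular hypervisor or filesystem implements it: the `Volume` and `Snapshot` classes and their methods are hypothetical names made up for illustration, and "blocks" are just list entries.

```python
class Snapshot:
    """Toy snapshot: shares blocks with the volume until they change."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> contents at snapshot time

    def preserve(self, index, old_data):
        # Keep only the first saved copy: that is the data as it was
        # at the moment the snapshot was created.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Changed blocks come from the saved copies; untouched blocks
        # are read straight from the original (shared) blocks.
        if index in self.saved:
            return self.saved[index]
        return self.volume.blocks[index]

    def read_all(self):
        return [self.read(i) for i in range(len(self.volume.blocks))]


class Volume:
    """Toy block device (hypothetical) with copy on write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # Nothing is copied here, which is why creation is fast.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Copy the old block aside for each snapshot, then overwrite.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


vol = Volume(["A0", "B0", "C0"])
snap = vol.snapshot()
vol.write(1, "B1")

print(vol.blocks)       # ['A0', 'B1', 'C0']
print(snap.read_all())  # ['A0', 'B0', 'C0']
print(len(snap.saved))  # 1
```

Only one block was copied, yet the snapshot still reads back as the complete original.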

This copy on write scheme leads to a few important facts. First, the snapshot will always represent the original file at the moment in time that it was created. Because no data is actually copied when the snapshot is first created, that can take place very quickly - the running system will be frozen briefly, but it doesn't take long. However, the necessary copy on write will have some impact on the VM's performance.

The snapshot may take up far less disk space than would be required for an actual copy. Over time, as new data is written to the original, the snapshot will need more new disk blocks and eventually could end up consuming as much space as its source.
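As a rough illustration of that growth, the space a snapshot needs depends on how many distinct blocks of the original have been overwritten, not on how many writes took place. The function below is a simplified model under assumed numbers (a 4096-byte block size, no metadata overhead); `snapshot_space` is a hypothetical name, not part of any real tool.

```python
def snapshot_space(total_blocks, overwritten_indexes, block_size=4096):
    """Estimate bytes held by a snapshot in a simple copy on write model."""
    # A block is copied once, the first time it is overwritten, so
    # repeated writes to the same block cost nothing extra.
    distinct = {i for i in overwritten_indexes if 0 <= i < total_blocks}
    return len(distinct) * block_size


# Three writes, but only two distinct blocks were touched.
print(snapshot_space(1000, [1, 1, 2]))     # 8192

# Every block overwritten: the snapshot is now as large as its source.
print(snapshot_space(1000, range(1000)))   # 4096000
```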

There are two uses for these snapshots. One is to be able to quickly roll back to a particular point in time. The other use is for backup, and that's where the confusion comes in.

The snapshot file itself is a backup of sorts, but in this context its purpose is to allow you the leisure of copying to other media (tape, another disk, another directory, whatever) without having to be concerned about data changing during the process. You back up or copy the snapshot file, which always stays frozen at the moment it was created.

That's great, but for most small business software it does NOT eliminate the necessity of stopping user activity. It cuts it down: creating a snapshot usually is literally a matter of a second or two. But it does not eliminate it.

Let's imagine an accounting application where a user has just entered a new invoice. There are several things that the software needs to do, but of course each action needs to be performed serially. In the midst of that, a snapshot of the data is requested.

The snapshot process will, of course, flush any pending disk writes, but the application software may have only asked to write some of what it needs to write to finish the transaction. You've frozen the data, but there are still things the application hasn't gotten to yet. The data may be corrupt.
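Here is a small Python sketch of that failure. It is a made-up model, not real accounting software: the `disk` dictionary, the two-step `post_invoice` function and the `is_consistent` check are all assumptions for illustration, and `copy.deepcopy` stands in for the snapshot.

```python
import copy

# Pretend on-disk state: the invoice list and a running total that
# must always agree with it.
disk = {"invoices": [], "receivables_total": 0}


def post_invoice(amount, snapshot_hook=None):
    """Post an invoice in two serial steps; optionally snapshot in between."""
    disk["invoices"].append(amount)       # step 1: record the invoice
    frozen = snapshot_hook() if snapshot_hook else None  # snapshot lands here
    disk["receivables_total"] += amount   # step 2: update the total
    return frozen


def is_consistent(image):
    return sum(image["invoices"]) == image["receivables_total"]


frozen = post_invoice(250, snapshot_hook=lambda: copy.deepcopy(disk))

print(is_consistent(disk))    # True
print(is_consistent(frozen))  # False
```

The live data ends up fine, but the frozen image has the invoice without the matching total, and that is the copy your backup would contain.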

Application software may be ready to deal with that situation. If you don't know that it is, you MUST get users to be inactive before creating the snapshot. For small businesses that typically aren't doing any user input during the backup window, that's no problem. For larger businesses it may be - but of course they are more apt to have more sophisticated application software that can roll back or complete partial transactions.

The snapshot software may allow you to automatically call custom scripts to shut down databases or do any other "pre-freeze" and "post-thaw" work. For example, see VMware's Virtual Machine Backup Guide for an overview of their snapshot/backup methods.
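The general shape of such hooks can be sketched as below. This does not use any real VMware interface: `snapshot_with_hooks` and the three callables are hypothetical stand-ins, and a real setup would run whatever scripts your snapshot software supports.

```python
def snapshot_with_hooks(pre_freeze, take_snapshot, post_thaw):
    """Quiesce the application, snapshot, then always resume it."""
    pre_freeze()
    try:
        return take_snapshot()
    finally:
        # Runs even if the snapshot fails, so the database is not
        # left stopped.
        post_thaw()


events = []


def stop_database():
    events.append("pre-freeze: database stopped")


def take_snapshot():
    events.append("snapshot taken")
    return "snap-001"  # made-up snapshot name


def start_database():
    events.append("post-thaw: database started")


result = snapshot_with_hooks(stop_database, take_snapshot, start_database)
print(result)
for line in events:
    print(line)
```

The point of the `try`/`finally` is that users are only locked out for the second or two the snapshot takes, whether or not it succeeds.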

Snapshots do make backup easier, but you still have to consider your application software.




2 comments




© Anthony Lawrence







Thu Apr 2 18:58:32 2009: 5963   BruceE

On the restore side of live snapshots there is a simple analogy that I find helps people understand what the restore will be like. The restored snapshot is equivalent to powering off the VM at the time of the snapshot, then powering it back on. The same sorts of corruption that can occur in such an event are the ones that must be dealt with in a live snapshot restore. RDBMSs are designed for that situation, so they will recover to a consistent state. Of course, if your application on top of the RDBMS has integrity constraints that are not in the DB, or if it stores some data outside of the DB, then the application will also need recovery logic built into it.

Of course, the VM Host could also add the ability to snapshot the machine state (RAM, CPU registers, cache, device state, etc.) using technology similar to what is done to achieve VM Host to Host live migration (like VMware ESX's Vmotion). This would allow the possibility of restoring the VM to its running state. Of course, then you'd have to deal with application to client integrity issues, as client software on desktops that connects to the application on the VM will likely have closed its TCP connections while the recovered machine will not have. Of course, handling for such a client-server failure should be built into the application as well.



Thu Apr 2 20:20:35 2009: 5964   TonyLawrence

Ayup - and given all that, unless you double-dog know that you can safely do all this, it's better to just get everybody off for the few seconds it takes.

