APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Why not differential backups, Internet backups, disk to disk??

© April 2002 Tony Lawrence

I get this question frequently. It's usually triggered either because the tape device can't hold an entire backup set or because the time required for backup interferes with productive work. Most of the time this can be easily remedied by a larger or faster storage device, but someone is bound to bring up the idea of differential backups.

The idea is that you create a full backup that has everything, and from then on you back up only the files that have changed. Presumably that's a smaller set of files, and therefore this solves the space or time problems. Usually the full backup is refreshed on some schedule and the process starts again. There are variants on the theme; for example, the differential may include all files that have changed since the last full backup rather than just those that have changed since the last differential. That sort of scheme eventually ends up with the differential containing any and all files that ever change, no matter how infrequently; the full backup is the source of everything else.

Often the term "Incremental" is used to describe what I call a true differential: a backup of only those files that have changed since the previous backup. I'll use that term for the rest of this article. Remember that a Differential will always have everything that has changed since the last complete backup; an Incremental will only have files that have changed since the previous backup, whether that was the full backup or another Incremental. The first Incremental and the first Differential taken right after the full backup would be exactly the same; after that they will probably contain different files. An Incremental CAN be smaller than a Differential but could never be larger.
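To make the distinction concrete, here is a minimal sketch of how the two file sets could be selected from modification times. The `FileInfo` record, the file paths, and the timestamps are all hypothetical and chosen only for illustration; real backup software may track changes differently (archive bits, catalogs, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    mtime: float  # last-modified time, in seconds (hypothetical values below)


def changed_since(files, cutoff):
    """Return the sorted paths of files modified strictly after `cutoff`."""
    return sorted(f.path for f in files if f.mtime > cutoff)


def differential_set(files, last_full):
    """Differential: everything changed since the last FULL backup."""
    return changed_since(files, last_full)


def incremental_set(files, last_backup):
    """Incremental: only what changed since the previous backup of any kind."""
    return changed_since(files, last_backup)


# Hypothetical timeline: full backup at t=100, one Incremental at t=200.
files = [
    FileInfo("/data/archive.tar", mtime=50.0),   # unchanged since the full
    FileInfo("/data/ledger.db", mtime=150.0),    # changed before the Incremental
    FileInfo("/data/report.txt", mtime=250.0),   # changed after the Incremental
]
LAST_FULL = 100.0
LAST_INCREMENTAL = 200.0

differential = differential_set(files, LAST_FULL)
incremental = incremental_set(files, LAST_INCREMENTAL)

print("Differential:", differential)
print("Incremental: ", incremental)
```

In this example the Incremental is a subset of the Differential, which is why it can be smaller but never larger. It is also why a restore from Incrementals needs every set since the full backup, while a restore from a Differential needs only the full backup and the latest Differential.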

Differential or Incremental backups always seem like a great idea to people who haven't experienced the negative aspects. Admittedly, there can be circumstances where you have no other choice, but consider these points:

Wherever possible, doing a complete, full backup every day is easiest and gives the most data redundancy. If you absolutely cannot do that, then the Differential (everything modified since the last full backup) is better than true Incrementals. However, don't neglect having multiple full backups in either case.

By the way, my aversion to differential or incremental backups is based on many years of painful field experience. Although it is rare nowadays, not too many years ago I would be involved with drive failures about once a month: I have seen these problems for myself. I STRONGLY RECOMMEND FULL BACKUPS IF AT ALL POSSIBLE. Backup media gets larger and faster and cheaper every year, so most people CAN do complete backups, and should.

What about Network Backups to another hard drive?

While attractive in principle, backing up over the network to another hard drive isn't all that fast, and you also lose several important capabilities:

Consider this also: you have set up rsync or whatever to keep two machines up to date. Now you have a memory or motherboard problem on the main machine that scrambles database data. It's not bad enough to crash instantly, but it is bad enough to damage the database extensively. That bogus data will of course get transferred to the other machine: effectively a hardware problem on one box causes the identical problems on the other.

Sometimes the easiest way to fix such a problem is to go back in time to a point where the data was not corrupted. This may be because it's too corrupt to fix with ordinary tools but more often it's just because it is too difficult to figure out where all the problems are: the only sure solution is to revert to some previous state. The ONLY way to do that is to have multiple sets of removable backups that extend back in time.
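The point about going back in time can be sketched in a few lines. This is only an illustration: the dates, the five-set rotation, and the function name `latest_clean_backup` are hypothetical, and a backup taken on the day the corruption began is treated as suspect.

```python
from datetime import date


def latest_clean_backup(backup_dates, corruption_date):
    """Return the newest backup taken strictly before the corruption began.

    Returns None when every available backup is on or after that date.
    """
    clean = [d for d in backup_dates if d < corruption_date]
    return max(clean) if clean else None


# Hypothetical scenario: corruption began on the 10th, noticed on the 12th.
corruption_began = date(2002, 4, 10)

# Five removable sets, one per night, versus a mirror that only ever
# holds the most recent copy.
rotated_sets = [date(2002, 4, day) for day in (8, 9, 10, 11, 12)]
mirror_only = [date(2002, 4, 12)]

from_rotation = latest_clean_backup(rotated_sets, corruption_began)
from_mirror = latest_clean_backup(mirror_only, corruption_began)

print("Rotated sets restore point:", from_rotation)
print("Mirror restore point:      ", from_mirror)
```

With the rotated sets there is still a copy from the 9th to fall back on; the mirror has nothing that predates the damage, so the function returns `None`.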

Remember, I'm not saying that having the backup machine is a bad idea. It's not, and it can be very convenient. But you need removable media SOMEWHERE.

There are now inexpensive removable hard drives. They are still a little expensive, but you CAN do this.

Removable media is still the intelligent choice for backup and will remain so until solid state, non-volatile disk drives are common, and I'm not even sure if it's a bad idea then.

Why not Internet backups?

The problem here is two-fold: one, you probably can't back up ALL your data because the connection isn't fast enough and two, you are depending on the Internet being available for restore. I do think Internet backup is a great adjunct to in-house removable media, but that's all it is.

Maybe you have multiple redundant T3 connections and can do this, but even then, I think you should have in-house removable media for utmost safety.
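The speed problem is simple arithmetic. The sketch below estimates how long a transfer takes at a given line rate; the 100 GB data size and the `efficiency` factor are assumptions for illustration, while 1.544 Mbps and 44.736 Mbps are the nominal T1 and T3 rates.

```python
def transfer_hours(data_gb, link_mbps, efficiency=0.8):
    """Estimate the hours needed to move `data_gb` gigabytes over a link.

    data_gb    -- data size in decimal gigabytes (10**9 bytes)
    link_mbps  -- nominal line rate in megabits per second
    efficiency -- assumed fraction of the line rate actually achieved
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, rate > 0, efficiency in (0, 1]")
    bits = data_gb * 8 * 10**9
    seconds = bits / (link_mbps * 10**6 * efficiency)
    return seconds / 3600


DATA_GB = 100      # hypothetical size of one full backup
T1_MBPS = 1.544    # nominal T1 line rate
T3_MBPS = 44.736   # nominal T3 line rate

# Best case: assume the whole line rate is available to the backup.
t1_hours = transfer_hours(DATA_GB, T1_MBPS, efficiency=1.0)
t3_hours = transfer_hours(DATA_GB, T3_MBPS, efficiency=1.0)

print(f"T1: {t1_hours:.1f} hours, T3: {t3_hours:.1f} hours")
```

Even at full line rate, that hypothetical 100 GB takes about six days over a T1 and about five hours over a T3, and a restore has to move the same data back the other way.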

Your data is critical. Don't put it at risk.


Wed Sep 14 15:16:46 2005: 1086   anonymous

Nice Utopian view but not realistic in a large or enterprise environment, and you charge how much?

Wed Sep 14 17:49:03 2005: 1089   TonyLawrence

I don't think you read this article. I said, if you can't do it, either because of time or space, obviously you have to go to a differential or incremental scheme. But if you CAN, a full backup is a far better idea. There's nothing "Utopian" about it - either you can do full backups or you cannot. If you can, you should. Period.

Oh, and I charge $150.00 per hour. But opinions like this are free.

Wed Sep 14 22:26:10 2005: 1090   BigDumbDinosaur

There are now inexpensive removable hard drives. They are still a little expensive, but you CAN do this.

Any hard drive is an inferior alternative to tape or optical media (if the backup can fit therein). The primary reason we do backups is to protect ourselves from hard drive failure, which as Tony indicated in his article, used to be a routine problem some years ago. A hard drive is an inherently delicate device, and all it takes is one good hard knock to damage the mechanism and render the data inaccessible. While using a removable hard disk as a backup device is fairly convenient, it's not a final solution.

Nice Utopian view but not realistic in a large or enterprise environment, and you charge how much?

There's nothing Utopian about using the right methods. Just how valuable is your data to you anyhow?

As for what one might charge for one's services, I'm not sure what relevance that has to this article, since the information is free for the reading. I personally haven't worked with Tony, so I don't have an opinion on whether his rate is fair or not. However, I'd be willing to wager that with his range of experience in this business, he would be worth every cent of what he charges if your system went belly-up and you needed it back on-line in a jiffy.

In my case, my clients pay me well to keep their computing machinery greased and oiled, and none has ever complained about what I charge. If something were to go kaput and employees were suddenly unable to get anything accomplished, that would not be the best time to be pinching pennies. Which would be cheaper: hiring a high priced but very experienced technician who can come in, quickly size up the situation and promptly restore operation, or paying the salaries of numerous employees who are sitting around doing nothing while an inexperienced technician piddles around trying to figure out what's wrong?

Thu Oct 5 16:05:34 2006: 2505   anonymous

The weighting of this article is in the title, but... As Mr. Utopia pointed out, this isn't always feasible in a corporate environment. Corporations will generally aim to have a centralised backup infrastructure, and that means backing up over networks. Capacity is no longer the limiting factor; speed is. It may be outside the article's scope, but perhaps a reason to do differentials is to try to guarantee that a backup will complete before the data starts changing again. There's also the issue of data that is changing 24/7. I agree with the gist of the article, but Utopia has a point.

Thu Oct 5 17:21:06 2006: 2506   TonyLawrence

Once again: I said, if you can't do it, either because of time or space, obviously you HAVE to go to a differential or incremental scheme. But if you CAN, a full backup is a far better idea. There's nothing "Utopian" about it - either you can do full backups or you cannot. If you can, you should. Period.

Fri Sep 18 16:49:10 2009: 6932   TonyLawrence

I wrote this over seven years ago and really nothing has changed.

A lot of people are running around pushing Internet backup. I think that's great as an adjunct to in-house, but as I explained above, it shouldn't be your only solution even if your data is small enough to stream out every night.

I still get the "another hard drive" and the "copy it to another machine" people too and while that can be convenient, it isn't BACKUP.

What I do get asked a lot is what software I recommend. Unfortunately, every damn backup app I've seen for Windows has caused me grief somewhere. I suspect this is Microsoft's fault more than the app vendors, but still: they are all very good at giving you pretty reports showing what they backed up and how long it took. If only they were equally good when it comes time to restore a dead box!

Because of the frailty, when it comes to Windows, my normal advice that you have to TEST full restores has to be even stronger: you have to test regularly and consistently. That's annoying and time consuming but my experience says you just can't poke your head in the sand and trust. I wish it were otherwise. With Microlite on Unix/Linux, I *know* that if I get a backup and have bootable restore media, I *can* rebuild. I wish Windows backup apps gave me the same confidence, but they do not.
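One small part of a restore test can be automated: checking that the restored files match the originals byte for byte. The sketch below is only an illustration with hypothetical function names and throwaway test data; it compares file contents by SHA-256 and says nothing about whether a restored machine will actually boot, which still has to be tested by hand.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def restore_problems(original, restored):
    """Return relative paths that are missing from, or differ in, the restore."""
    want = tree_digests(original)
    got = tree_digests(restored)
    return sorted(p for p in want if got.get(p) != want[p])


# Throwaway demonstration data: one file restores cleanly, one is damaged,
# and one is missing from the restore altogether.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp, "original")
    restored = Path(tmp, "restored")
    original.mkdir()
    restored.mkdir()
    (original / "payroll.dat").write_text("payroll")
    (original / "stock.dat").write_text("inventory")
    (original / "notes.txt").write_text("notes")
    (restored / "payroll.dat").write_text("payroll")
    (restored / "stock.dat").write_text("inventry")  # damaged copy
    problems = restore_problems(original, restored)

print("Files that failed verification:", problems)
```

An empty list means every original file came back intact; anything else names the files to investigate before trusting that backup.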

