Basics: rsync HOWTO

Sometimes people from the Windows world think of rsync as just a tool for synchronizing laptop files, and although it can be used for that (see Using rsync to update laptop by Dirk Hart), it's also a general purpose copying tool that is worth learning about.

If you want something that is more like Briefcase, you might check out Unison. I have not personally used this, but it looks interesting.


Important: rsync is not quite analogous to Windows Briefcase. You can get similar functionality by some careful double invocation, but there are conditions that really can't be handled well. Of course that's true for Briefcase also, and for the same reasons.


You can learn quite a bit about rsync and how it works right on your own machine: no network necessary. That's actually a good way to learn: it's quick, and you can easily see the results.

Very under utilized

Like so many other Unix tools, rsync is often used at a very basic level without taking advantage of its more powerful features. Of course, because it is powerful and complex, people sometimes make the opposite mistake: thinking rsync is going to do something that it doesn't do. Trust me: I've made both those errors.

But simple is of course the place to start. So let's create a couple of directories to work with. I'll put mine in /tmp, but you can do whatever you find convenient.

cd /tmp
mkdir a b
date > a/froma
date > b/inb
rsync a/* b
 

The "b" directory now has been updated with files from a. That's as simple as you can get, but it's really no different than a copy in this case. Notice that a has NOT been updated with files from b. That's just like copy, in spite of the "sync" in the name.

But if we were doing this across a network, with either b or a on a different machine, even this simple invocation does have advantages over an rcp or scp. On a local (same machine) copy this will not happen, but over a network, rsync will transfer only the parts of a file that have actually changed. This is powerful stuff for large files and slower connections. It means that for a giant log file that has only had lines appended, roughly just the new lines are actually transferred. The algorithm that accomplishes this uses a "rolling checksum" and is well described at http://olstrans.sourceforge.net/release/OLS2000-rsync/OLS2000-rsync.html

If you were transferring to a remote machine, your syntax would be:

rsync -e ssh a/*  user@otherbox:/tmp/b
or
rsync -e ssh user@otherbox:/tmp/a/*  b
 

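It's also possible to run an rsync server on the receiving machine. You use double colons if that's the case, and what follows them is a module name from the server's configuration rather than a filesystem path: otherbox::module/a/* (where "module" is a placeholder name).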


The two commands above use ssh for the actual copying. If you have to use rsh instead, you would give -e rsh (older versions of rsync used rsh whenever -e was left off; current versions default to ssh).

Links

Let's make some more files in a:

ln a/froma  a/lnfroma
ln -s "$PWD/a/froma"  a/symlnfroma
rsync a/* b
 

We get an error message saying that the "symlnfroma" was skipped, but it looks like the other one copied. If you look more closely though, you'll see that it may not have done what you wanted: the files "froma" and "lnfroma" over in b are not hard links any more. Let's try again:

rsync -lH a/* b
 

That copied the symlink (-l) and restored the hard link (-H). Notice that the symlink in b does NOT refer to the file in b: rsync copies the link's target text unchanged, so it still names froma in a. If you were copying to a directory of the same name in some other hierarchy or on another machine, that would be exactly what you would want.

Archive flag

Most of the time, rsync is used with -a (archive), which combines a number of other options:

  • -r recurse into subdirectories
  • -l copy symlinks as symlinks
  • -p retain file permissions
  • -t retain file time stamps
  • -g retain group ownership
  • -o retain owner
  • -D preserve device files and special files

Note that -H is not included, and that's because it can be time consuming to figure out hard links. Expect rsync to run longer if you need to use -H. Also realize that ownership and group changes, as well as device file copying, need root permissions.

Other flags

Another often used flag not included in -a is -u, which says not to overwrite newer files. If we wanted to do a Briefcase style synchronization, we need -u, and we need to do the rsync in two directions:

rsync -aHu a/* b
rsync -aHu b/* a
 

That still falls short if the same file was updated in both places, because the older of the two changes is silently overwritten (that's not an easy problem for any such program, though).

Less often used is the -b flag, which creates backups of files it copies:

rsync -aHub a/* b
 

If you have followed along with the examples as given, that should have made no apparent change in a or b. But try this:

date > a/froma
rsync -aHub a/* b
 

Now b will have its older "froma" backed up to froma~. You can change the suffix (--suffix), and you can have the backup files put in a different directory (--backup-dir) if you like.

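Sometimes you want to delete files that no longer exist in the source. For that, rsync has to be given the directory itself (a/) rather than a shell wildcard (a/*), because --delete only cleans up directories that rsync was asked to synchronize:

rm a/inb
rsync -aHubv --delete a/ b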
 

That would probably be desirable for our Briefcase synchronization also.

A very useful flag when testing rsync is "-n". This just shows you what would be done (add -v if you want to see actual file names) without actually transferring anything:

rm b/*
rsync -aHubnv --delete a/* b
 

Nothing will actually be transferred to b, but you will see what it would do. That can also be useful for other things: it can be used to check whether files have changed, providing a function loosely similar to Tripwire. Let's say you copy files from one machine to another. You could later run rsync with the -n option from the "safe" machine. If it reports nothing to transfer, the files are still untouched as far as rsync's comparison can tell (by default it compares only size and modification time; add -c to compare checksums).

This is NOT really equivalent to products like Tripwire but it can be useful in some situations.

Compression

If you have large files and slow links, -z will compress the data as it is transferred.

More in the manual

There are even more esoteric flags; check the man page if you have more unusual needs. Whatever you need, it's probably there. Rsync is a powerful tool that is well worth learning.





9 comments




© Tony Lawrence





In the "Very under utilized" section, the syntax for rsync -e is wrong: you need to specify the command.

rsync -e ssh ... for example.


---December 10, 2004
***Alex Gill wrote:
Under the "Links" section,
ln -s a/froma a/symlnfroma
This left me with a broken link which pointed to "froma" in the "a/a/" directory (that dir doesn't exist). Instead I used:
ln -s froma a/symlnfroma
***





Tue Apr 19 22:23:51 2005: 356   anonymous


The following command works better than the examples shown because it also synchronizes files beginning with a period ('.'):

rsync -aHv a/ b




Sun Nov 26 20:55:14 2006: 2657   anonymous


I commonly use `rsync -PHSav`. For a two-sided rsync (rsync host1:a/ host2:b/; rsync host2:b/ host1:a/) try using the tool "unison".



Sun Nov 26 21:39:14 2006: 2658   TonyLawrence

Unison is bidirectional, but it is more trouble to set up.



Thu Sep 4 21:37:36 2008: 4522   bruceg


A little tip to help save some hair pulling... In many of the examples I have seen for setting up a /etc/rsyncd.conf file, with SAMBA like share syntax, many of them omit the 'uid' setting. I have recently been setting up a lot of new servers, with a *lot* of data, and using rsync really cuts down on the cut-over time. However, if you want the permissions to be the same, the /etc/rsyncd.conf file will need to have 'uid = root' as an option, or the permissions will be changed on the "server" end to the process that rsync is running as, usually 'nobody'.

What we did before rsync, during upgrades, was waste a lot of time waiting for 'tar' or 'cpio' or whatever to finish. With rsync, we can start the data transfer in the middle of the day to get the bulk of the stuff moved, then when cut-over time comes, lock everyone out, do an rsync which should be lightning fast, bring down the old server, and bring up the new server with the correct IP and hostname, and you are off and running in no time! I can remember telling users (back in the slow network/QIC tape days), "Yea, we should be down for about 6-8 hours while we move the data to the new server. You won't be able to login". Now, cut-overs last about 20 min, which includes the bringing down/up of the servers. (Of course, if there is something you didn't think of, you can add an hour or two to that number.) But that's not rsync's fault, is it? :-)

Bruce Garlock



Sat Jan 17 07:07:46 2009: 5178   David

Thanks to all for this info. I now have got rsync to work and I am manually backing up my desktop to my laptop and vice-versa. However, I want to use CRONTAB to automate. How can I get some output to see what happened?



Sat Jan 17 10:58:31 2009: 5179   TonyLawrence

I assume you have some script you call from crontab. Simply add

>> /whereever/filename

after your script name. If you want error output also

>> /whereever/filename 2>&1

See the various articles here about "cron" and "stderr" if you don't follow that.





Sat Jan 17 11:01:29 2009: 5180   TonyLawrence



Specifically:

(link) (stderr) (link) (cron) (link) (cron)



Sun Jan 18 00:36:22 2009: 5181   David

Thanks, Tony. I'll check these out.

I'm not using a script - just 'crontab' to create my cron details.



Sun Jan 18 02:10:03 2009: 5182   TonyLawrence

It doesn't matter how you create the crontab - just have the output of your command or script redirect to wherever you want it.

With Linux, you can also set up the crontab file to send the output by mail (the MAILTO variable). See the man or info page for crontab.

