rsync is a CLI tool that covers various use cases: transferring data, creating backups or archives, mirroring data sets, integrity checks, and many more.
Reference for this article: rsync version 3.2.7 and two Ubuntu 22.04 LTS machines.
If you want to transfer files to a remote host, rsync must be installed on both sides, and a connection via SSH must be possible.
Side note: rsync can be used via rsh or as a daemon/server over TCP873, but I won’t cover those in this article and will concentrate on the transfer over SSH.
Basic File transfer #
You can transfer files locally, from local to a remote host or from a remote host to your local machine. Unfortunately, you can’t transfer files from one remote host to another remote host.
The syntax for rsync is the following:
rsync [options] source destination
- The syntax for the remote-shell connection is:
[user@]host:/path/to/data
- Example of transferring a directory to a remote host:
rsync -r ./data user@192.0.2.55:/home/user/dir/
Side note: ./data/ copies only the files in the directory, ./data copies the directory as well.
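The trailing-slash rule is easy to try out locally before touching a remote host. The following is a minimal sketch that uses throwaway directories created with mktemp; all directory and file names are made up for illustration.

```shell
# Scratch directories; all names here are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/data" "$work/with-slash" "$work/without-slash"
echo "hello" > "$work/data/file.txt"

# Trailing slash: only the contents of data/ are copied.
rsync -r "$work/data/" "$work/with-slash/"

# No trailing slash: the directory data itself is created in the destination.
rsync -r "$work/data" "$work/without-slash/"

ls "$work/with-slash"      # file.txt
ls "$work/without-slash"   # data
```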
By default, rsync does not preserve metadata like ownership or timestamps of an item. More information about this is in the metadata section below.
It is highly recommended to use various options with rsync to get the results you want.
- Common options:
--recursive/-r # copy directories recursively
--ipv4/-4 # use IPv4
--ipv6/-6 # use IPv6
--human-readable/-h # output numbers in a human-readable format
--quiet/-q # decreases the output, recommended for automation like cron
--verbose/-v # increase the output of information
--archive/-a # recursive + keeps most of the metadata. Further information in the ‘metadata’ section
Specify a different SSH port #
The default TCP port for SSH is 22, but some servers listen on another port. That is not a problem, and you can tell rsync to connect to another port:
-e "ssh -p 2222" # connection to TCP2222 instead of TCP22
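If you connect to non-default ports often, a small wrapper can save typing. The sketch below is only an illustration: the function name sync_over_port and the RSYNC_CMD variable are my own assumptions, not part of rsync. Setting RSYNC_CMD to "echo rsync" prints the command instead of running it, so nothing is sent to the host.

```shell
# sync_over_port PORT SOURCE DESTINATION
# Hypothetical helper: runs rsync over SSH on a non-default port.
# RSYNC_CMD is an assumption of this sketch; set it to "echo rsync"
# to print the command instead of running it.
sync_over_port() {
    port=$1
    src=$2
    dest=$3
    ${RSYNC_CMD:-rsync} -ah -e "ssh -p $port" "$src" "$dest"
}

# Preview only: prints the rsync command without contacting the host.
RSYNC_CMD="echo rsync" sync_over_port 2222 ./data user@192.0.2.55:/home/user/dir/
```

Note that echo drops the quotes around "ssh -p 2222" in the preview; in the real call the string is still passed to -e as a single argument.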
Mirroring data #
You can simply mirror a directory to or from a remote host with the --delete option. Rsync compares the source and destination directories, and if it finds files in the destination directory that are missing in the source, it will delete those to keep both sides the same. Please use it with caution and start with a dry run.
kuser@pleasejustwork:~/9_temp/rsync$ rsync -ah --delete --itemize-changes ./data user@192.0.2.55:/home/user/
sending incremental file list
*deleting data/small-files-7
[...]
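To get a feeling for --delete without risking real data, you can mirror two local scratch directories. This is a minimal sketch with made-up file names; it does the dry run first and the real run second.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "keep"  > "$work/src/keep.txt"
echo "stale" > "$work/dst/stale.txt"   # exists only on the destination

# Dry run first: lists what would be deleted without touching anything.
rsync -a --delete --dry-run --itemize-changes "$work/src/" "$work/dst/"

# Real run: dst now mirrors src, so stale.txt is removed.
rsync -a --delete "$work/src/" "$work/dst/"
ls "$work/dst"   # keep.txt
```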
Deleting source files after transfer #
The option --remove-source-files - as the name already implies - removes the source files after they have been transferred successfully to the destination. Only files are removed; the directories on the source stay in place. Please use it with caution and start with a dry run.
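A quick local test shows what is left behind on the source side. The sketch below uses scratch directories with hypothetical names.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/dst"
echo "a" > "$work/src/a.txt"
echo "b" > "$work/src/sub/b.txt"

rsync -a --remove-source-files "$work/src/" "$work/dst/"

# Files are gone from the source, but the (now empty) directories remain.
find "$work/src" -type f   # prints nothing
find "$work/src" -type d   # still lists src and src/sub
```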
Update-behaviour #
There are some options to make sure that rsync does not overwrite data on the destination.
- Examples:
--update/-u # don’t update files that are newer on the destination
--existing # don’t create new files on the destination
--ignore-existing # don’t update files that exist on the destination
--size-only # skip files that match in size, even if the timestamp differs
This can be helpful when the files are used or modified by another application and you don’t want to overwrite anything.
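As an illustration, here is a minimal local sketch of --ignore-existing. The file names are invented; the point is that a file already present on the destination keeps its content while missing files are still created.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "new version" > "$work/src/report.txt"
echo "local edit"  > "$work/dst/report.txt"   # already exists on the destination
echo "brand new"   > "$work/src/extra.txt"

# --ignore-existing: report.txt is left alone, extra.txt is still created.
rsync -a --ignore-existing "$work/src/" "$work/dst/"

cat "$work/dst/report.txt"   # local edit
cat "$work/dst/extra.txt"    # brand new
```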
Item Metadata #
As mentioned before, rsync does not preserve the metadata of a file or directory by default. You can set various options to decide what you keep.
- Your options:
--perms/-p # permissions
--owner/-o # owner
--group/-g # group
--times/-t # modification time
--atimes/-U # access time
--crtimes/-N # create time
-A # ACLs
-X # extended attributes
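One of the most common options is --archive/-a, which preserves most metadata and adds recursion. It is in fact a shortcut for -rlptgoD, so besides permissions, times, owner, and group, it also preserves symbolic links, special and device files.
It does not include ACLs (-A), extended attributes (-X), access times (-U), or create times (-N); add those separately if you need them.
You can use the --no-* syntax to remove single attributes like --no-perms.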
Exclude directories and files #
Rsync makes it easy to exclude files and directories. I’ll show you some examples in the following list.
- Example:
--exclude "*.iso" --exclude "*.img"
--exclude={"/tmp/*","/etc/*"}
You can use --exclude-from= to reference a file with a list of exclusions to make it more manageable.
Every line is one exclusion, and lines starting with ; or # are interpreted as comments and ignored:
--exclude-from='/exclude.txt'
$ cat exclude.txt
.git
*.iso
# Temp
/tmp
/cache
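To see an exclude file in action, you can run it against a small local tree. The following sketch uses made-up file names and a shortened exclude list; the leading slash in /cache anchors the pattern to the top of the transferred directory.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/cache" "$work/src/docs" "$work/dst"
echo "x" > "$work/src/image.iso"
echo "x" > "$work/src/cache/blob"
echo "x" > "$work/src/docs/readme.txt"

# One pattern per line; lines starting with # or ; are comments.
cat > "$work/exclude.txt" <<'EOF'
*.iso
# Temp
/cache
EOF

rsync -a --exclude-from="$work/exclude.txt" "$work/src/" "$work/dst/"
find "$work/dst" -type f   # only docs/readme.txt is left
```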
Exclusion by file size
You can exclude files from being transferred if they’re too small or too large with --max-size= and --min-size=:
- Examples:
--max-size=500m # max file size of 500 mebibytes
--min-size=5kb # min file size of 5 kilobytes
- Common scheme:
b byte
k kilo/kibi
m mega/mebi
g giga/gibi
t tera/tebi
p peta/pebi
A single letter, or three letters ending with ib like kib, tells rsync to use the binary prefix (multiplied by 1024) - kibibytes, and two letters like kb tell rsync to use the decimal prefix (multiplied by 1000) - kilobytes.
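The difference between the two prefixes is easiest to see in plain numbers. This small sketch only does the arithmetic for the two examples above; it does not call rsync.

```shell
# Byte values for the two examples above, following the suffix rules.
max_size=$((500 * 1024 * 1024))   # 500m -> binary prefix, 1024-based
min_size=$((5 * 1000))            # 5kb  -> decimal prefix, 1000-based

echo "--max-size=500m is $max_size bytes"   # 524288000
echo "--min-size=5kb is $min_size bytes"    # 5000
```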
Limit transfer bandwidth #
Sometimes, it is necessary to limit the transfer speed of rsync. You can do it with --bwlimit=, which accepts the same suffixes as --max-size and assumes units of 1024 bytes (KiB/s) if no suffix is given.
- Some examples:
--bwlimit=100 # limits bandwidth to 100 KiB/s
--bwlimit=250k # limits bandwidth to 250 KiB/s
--bwlimit=1m # limits bandwidth to 1 MiB/s
Data Compression #
You can compress your data transfer, which is great for slow connections. Activate compression with --compress/-z; if you do not specify a method, rsync will choose one that is compatible with the server side.
You can check the available algorithms with rsync --version:
$ rsync --version
rsync version 3.2.7 protocol version 31
[...]
Compress list:
zstd lz4 zlibx zlib none
[...]
You can choose the compression algorithm with --compress-choice=/--zc=.
Besides the algorithm, you can choose the compression level with --compress-level=/--zl=. Every algorithm has its own list of levels, and it is recommended to look them up.
Side note: you can choose --zl=999999999 to get the maximum compression no matter what algorithm you choose, as rsync silently limits this value to the maximum level.
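In a script, it can be handy to check whether an algorithm is listed before asking for it. The helper below is a hypothetical sketch (supports_compression is my own name, not an rsync feature). It reads the text of rsync --version from stdin and assumes the algorithms are on the line right after "Compress list:", as in the output shown above. The sample input is copied from that output, so the sketch runs without rsync.

```shell
# supports_compression ALGO
# Hypothetical helper: reads `rsync --version` output on stdin and succeeds
# if ALGO appears in the line following "Compress list:".
supports_compression() {
    awk -v algo="$1" '
        /Compress list:/ { in_list = 1; next }
        in_list {
            for (i = 1; i <= NF; i++) if ($i == algo) found = 1
            in_list = 0
        }
        END { exit !found }
    '
}

# Sample input copied from the version output shown above.
sample='Compress list:
    zstd lz4 zlibx zlib none'

if printf '%s\n' "$sample" | supports_compression zstd; then
    echo "zstd is available"
fi
```

With a real installation you would pipe the live output instead, for example: rsync --version | supports_compression zstd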
Showing Transfer Progress #
By default, rsync does not show any progress at all.
$ rsync -ah ./data user@192.0.2.55:/home/user/
> nothing
With -v you get more verbose output that shows at least the file that rsync is transferring at the moment:
$ rsync -avh --delete ./data user@192.0.2.55:/home/user/
sending incremental file list
data/
data/big-file
[...]
With --progress you get the progress and transfer speed per file:
$ rsync -ah --progress ./data user@192.0.2.55:/home/user/
sending incremental file list
data/
data/big-file
17,92M 1% 2,59MB/s 0:06:27
To see only the total progress, use --info=progress2:
$ rsync -ah --info=progress2 ./data user@192.0.2.55:/home/user/
4,42M 0% 4,07MB/s 0:04:10
The number behind progress is the verbosity level: 0=no output; 1=per file; 2=total.
This progress is better than nothing, but it can be vague as rsync is still checking the rest of the files for changes. With --no-inc-recursive/--no-i-r you can tell rsync to create the file list first and then start the transfer to make it more precise. That said, it delays the initial transfer.
You can use --stats to get the transfer results at the end of the transfer.
Start a dry run #
Side note: the following method can be used to perform an integrity check. For example, you used another tool to transfer a large data set, and you want to check if everything was transferred correctly. You can double-check it with rsync and even correct things.
Depending on your use case, a mistake can delete data. To avoid that, we can use two features to safely check the steps rsync will perform.
I am talking about --dry-run/-n and --itemize-changes/-i. The former performs a trial run that makes no changes, and the latter shows you all the changes rsync will perform.
Let me show you an example, and don’t worry about the other options for now:
kuser@pleasejustwork:~/9_temp/rsync$ rsync -ah --delete --itemize-changes --dry-run ./data user@192.0.2.55:/home/user/
sending incremental file list
*deleting data/small-files-7
.d..t...... data/
0 0% 0,00kB/s 0:00:00 (xfr#0, to-chk=0/32)
<f.st...... data/small-files-1
<f+++++++++ data/small-files-14
cd+++++++++ data/new-data/
5 0% 4,88kB/s 0:00:00 (xfr#5, to-chk=0/32)
sent 583 bytes received 61 bytes 429,33 bytes/sec
total size is 1,05G speedup is 1.628.223,61 (DRY RUN)
- Explanation for this example of --itemize-changes:
*deleting data/small-files-7 # deletes file on destination
.d..t...... data/ # timestamp of directory data changed
<f.st...... data/small-files-1 # changing size and timestamp on destination of file small-files-1
<f+++++++++ data/small-files-14 # file will be created in destination
cd+++++++++ data/new-data/ # new directory in source detected; will be created on destination
- The syntax of this string is YXcstpoguax and is explained as follows:
Y # type of update performed
X # is the file type
cstpoguax # are the attributes that could be modified
- Explanation of update types Y:
< # file is being SENT
> # file is being RECEIVED
c # local change or creation of an item (directory, sym-link, etc)
h # item is a hard link
. # item is not getting updated
* # the rest of the output contains a message (e.g. deleting)
- Explanation of file types X:
f # stands for file
d # stands for directory
L # stands for sym-link
D # stands for device
S # stands for ‘special’, e.g. named sockets
- Explanation for the attributes cstpoguax of an item:
c # checksum
s # size
t # timestamp
p # permissions
o # owner
g # group
u | n | b # u = access time; n = create time; b = both, access and create times
a # ACL information
x # extended attributes
- Explanation of the status of the attribute:
A letter means the attribute is being updated
. # attribute unchanged
+ # item newly created
? # change is unknown, working with old rsync versions
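If you want to process the --itemize-changes output in a script, the string can be split along the YXcstpoguax layout described above. The following is a minimal sketch; the variable names are my own, and the sample string is taken from the example output.

```shell
# Split an --itemize-changes string into its three parts.
item='<f.st......'

update_type=$(printf '%s\n' "$item" | cut -c1)    # Y
file_type=$(printf '%s\n' "$item" | cut -c2)      # X
attributes=$(printf '%s\n' "$item" | cut -c3-)    # cstpoguax positions

echo "update type: $update_type"   # <
echo "file type:   $file_type"     # f
echo "attributes:  $attributes"    # .st......
```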
Transfer Logging #
Rsync does not write a log file by default, but there are multiple ways to get one.
You can create a log file with --log-file=:
$ rsync -ah --info=progress2 --log-file=./rsync.log ./data user@192.0.2.55:/home/user/
29,43M 2% 3,13MB/s 0:05:18
[...]
and the logs would look like this:
$ cat rsync.log
2024/01/14 18:24:26 [647220] building file list
2024/01/14 18:24:26 [647220] cd+++++++++ data/
2024/01/14 18:24:34 [647220] sent 29630071 bytes received 585 bytes total size 1048576005
[...]
You can modify the name of the logs, for example, by adding a timestamp. That is great for automation like daily cron jobs.
$ rsync -ah --info=progress2 --log-file=./rsync-`date +"%F-%I%p"`.log ./data user@192.0.2.55:/home/user/
32,28M 3% 4,84MB/s 0:03:24 ^C
[...]
$ ll
[...]
-rw-r--r-- 1 user user 577 Jan 14 18:31 log-2024-01-14-06.log
[...]
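For cron jobs, it can help to build the log file name once and keep it in a variable, so the same name can be reused after the transfer. This is a small sketch; the variable name logfile and the 24-hour %H%M format are my own choices and differ from the command above.

```shell
# Build the log file name once so it can be reused after the transfer.
# The 24-hour %H%M format is a choice of this sketch, not of the example above.
logfile="./rsync-$(date +%F-%H%M).log"
echo "rsync would log to: $logfile"

# Example invocation (not executed here):
# rsync -ah --log-file="$logfile" ./data user@192.0.2.55:/home/user/
```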
Another option is to save your console output to a log file like this:
rsync command >> ./rsync.log
This is a quick and dirty version.
Rsync provides a large set of logging options and lets us decide what to show and hide. As it is out of the scope of this article, I won’t go into detail, but I wanted to share the --info=help output to give you an idea of the options.
$ rsync --info=help
Use OPT or OPT1 for level 1 output, OPT2 for level 2, etc.; OPT0 silences.
BACKUP Mention files backed up
COPY Mention files copied locally on the receiving side
DEL Mention deletions on the receiving side
FLIST Mention file-list receiving/sending (levels 1-2)
MISC Mention miscellaneous information (levels 1-2)
MOUNT Mention mounts that were found or skipped
NAME Mention 1) updated file/dir names, 2) unchanged names
NONREG Mention skipped non-regular files (default 1, 0 disables)
PROGRESS Mention 1) per-file progress or 2) total transfer progress
REMOVE Mention files removed on the sending side
SKIP Mention files skipped due to transfer overrides (levels 1-2)
STATS Mention statistics at end of run (levels 1-3)
SYMSAFE Mention symlinks that are unsafe
ALL Set all --info options (e.g. all4)
NONE Silence all --info options (same as all0)
HELP Output this help message
Options added at each level of verbosity:
0) NONREG
1) COPY,DEL,FLIST,MISC,NAME,STATS,SYMSAFE
2) BACKUP,MISC2,MOUNT,NAME2,REMOVE,SKIP