I use rsync a lot in my multi-server environment. For example, it’s handy for consolidating all the Apache log files (the same site is sometimes served from many servers) in one place. In my GeoIP BIND setup, BIND’s normal zone transfers no longer work and a tool like rsync is required to replace them. I also use it to relocate telephone recordings from Asterisk to another machine that is better suited to distributing them to those with access, as it has more storage and bandwidth.
There are two problems with rsync on Debian: Debian doesn’t start the rsync daemon and provides no init.d script for startup at boot, and rsync has no facility for PID files when executing a file transfer.
The first solution is to add the following command to /etc/rc.local. The command can also be crontab’ed to ensure the rsync daemon stays running, and it is superior to normal initialization because it checks the PID file and won’t start rsync if it’s already running (this relies on the daemon writing /var/run/rsyncd.pid, which requires a matching “pid file” setting in rsyncd.conf):
/sbin/start-stop-daemon -p /var/run/rsyncd.pid -u root -x /usr/bin/rsync -n rsync -S -- --daemon
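As a rough sketch of the crontab variant, the line below could go into root’s crontab (via crontab -e). The five-minute interval and the redirect to /dev/null are my assumptions, not part of the original setup; the redirect only stops cron from mailing the “already running” message each time.

```
# Assumed schedule: check every five minutes that the rsync daemon is up.
*/5 * * * * /sbin/start-stop-daemon -p /var/run/rsyncd.pid -u root -x /usr/bin/rsync -n rsync -S -- --daemon >/dev/null 2>&1
```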
Further to that, crontab’ed rsync jobs can sometimes have a huge hit of data that will take some time to transfer. Or perhaps there are problems with the network causing slower than normal file transfers. There are many scenarios where the crontab’ed job would be executed again before a previous job has completed.
The solution again is to check PID files. My answer was to write a small shell script (below) which creates a PID file for rsync jobs based on a specified “name” (which probably should be associated with the rsync share name and/or the host specified in the job). The solution is good enough to allow runs of rsync every minute, or maybe even multiple times per minute.
To execute rsync with this script you would run:
./rsync-pid.sh "rsync -avz --compress-level=9 --delete --password-file=/path/to/password/file /path/to/local/data/* rsync://user@host/path/to/remote/data" processname
The script’s source code is as follows:
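#!/bin/sh
# Usage: rsync-pid.sh "<command>" <name>
# $1 is the command to run, $2 is the name used for the PID file.

# Remove a stale PID file left behind by a job that is no longer running.
if [ -e "/var/run/rsync/$2.pid" ]
then
    pid=`cat "/var/run/rsync/$2.pid"`
    ps=`ps auwx | grep rsync | grep -w "$pid" | grep -v grep | wc -l`
    if [ "$ps" -eq 0 ]
    then
        /bin/rm "/var/run/rsync/$2.pid"
        unset pid
    fi
fi

# Run the command only if no PID file exists for this name.
if [ ! -e "/var/run/rsync/$2.pid" ]
then
    pid=$$
    echo $pid > "/var/run/rsync/$2.pid"
    # $1 is left unquoted on purpose so the shell splits it into command and arguments.
    $1
    /bin/rm "/var/run/rsync/$2.pid"
else
    echo "$2 is already running!"
fi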
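The stale-PID check in the script above greps the process list. A possible variation is to ask the kernel directly with kill -0, which sends no signal and only reports whether the PID exists. The sketch below wraps the same guard in a function; run_guarded and PID_DIR are hypothetical names I made up for illustration, and the directory is assumed to exist already.

```shell
#!/bin/sh
# Sketch of the same PID-file guard, using `kill -0` for the liveness check.
# PID_DIR and run_guarded are hypothetical names, not part of the original script.
PID_DIR="${PID_DIR:-/var/run/rsync}"

# run_guarded NAME COMMAND [ARGS...]
# Runs COMMAND unless a live process already holds NAME's PID file.
run_guarded() {
    name=$1
    shift
    pidfile="$PID_DIR/$name.pid"

    if [ -e "$pidfile" ]; then
        oldpid=$(cat "$pidfile")
        # kill -0 sends no signal; it succeeds only if the PID exists
        # and we are allowed to signal it.
        if [ -n "$oldpid" ] && kill -0 "$oldpid" 2>/dev/null; then
            echo "$name is already running!"
            return 1
        fi
        rm -f "$pidfile"   # stale file from a job that has exited
    fi

    echo $$ > "$pidfile"   # record this shell's PID
    "$@"                   # run the command with its arguments intact
    status=$?
    rm -f "$pidfile"
    return $status
}
```

With this in place a job would be started as run_guarded processname rsync -avz --delete /path/to/local/data/ rsync://user@host/path/to/remote/data. Passing the command as separate arguments rather than one quoted string avoids a second round of word splitting. Note that kill -0 also fails for a process owned by another user, so this sketch assumes every job for a given name runs as the same user (for example root from cron).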