Hi!
I have a backup job which worked perfectly fine on a v14 VM.

Command:
rsync -a --partial --inplace --numeric-ids --stats --human-readable --delete

It copies to an NFS-mounted directory (on a NAS, which is very quick).

rsync runs perfectly fine on a large number of small files (the web backup). The very same command also backs up mail (where large files exist). As soon as rsync picks up a large file (>2GB), CPU usage quickly climbs beyond 100% and the system freezes and becomes unresponsive. The only way out is to restart the guest from the VM host. The host machine doesn't show anything strange in its CPU usage, and neither does the NAS.

The system logs don't show anything useful (or I don't understand where to look); the system log is simply cut off without any message.

I did some Google searching, but could not find anything useful.

My setup is:
Host: Win10 (RAM 32GB, i7-6700K), VMware Workstation 14, 2 guests on it: one v15 (this one), the other v14 (with very few resources).

Guest: LAMP 15

RAM: 8GB
Filesystem                       Size  Used Avail Use% Mounted on
udev                             3.9G     0  3.9G   0% /dev
tmpfs                            799M   12M  787M   2% /run
/dev/mapper/turnkeyvm-root       177G  123G   47G  73% /
tmpfs                            3.9G     0  3.9G   0% /dev/shm
tmpfs                            5.0M     0  5.0M   0% /run/lock
tmpfs                            3.9G     0  3.9G   0% /sys/fs/cgroup

No modifications have been made to the system.

Any suggestions?

Jeremy Davis's picture

I did a quick google and found some similar reports (although not the same) but they were for much older versions of rsync. So I doubt they are related.

The fact that you said the exact same command works fine in v14.x suggests that there may be a regression in rsync. Having said that, I checked the Debian bug tracker and there doesn't appear to be anyone reporting a similar bug against rsync v3.1.2 (the version in Debian 9/Stretch / TurnKey v15.x). That would suggest that, even if it is a regression, the circumstances must be pretty specific. Are there any other factors you think may be in play here? E.g. are both servers syncing to the same remote host? Are the same files being transferred? If not, perhaps try swapping the jobs between your 2 machines and see if that changes things. I.e. see if the problematic files are ok on v14.x and/or see if the ok files are a problem on v15.x.

I also suggest that you try increasing the verbosity of rsync and see if anything specific shows up. I.e. -v should give you more info than the default, and adding more v's should give even more info. I'm not sure how many v's it maxes out at, but the man page says this:

A single -v will give you information about what files are being transferred and a brief summary at the end. Two -v options will give you information on what files are being skipped and slightly more information at the end. More than two -v options should only be used if you are debugging rsync.
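
As a rough sketch of what that could look like (the source, destination and log paths below are made up, so substitute your own), here is the original command with -vv added, plus rsync's --log-file option so that the output also lands in a file on the guest. It only prints the command for review rather than running it. If the guest freezes hard the last few log lines may still be lost, but with luck the log will show which file rsync was working on.

```shell
#!/bin/sh
# Hypothetical paths; replace with your own.
SRC="/var/vmail/"
DEST="/mnt/nas/backup/vmail/"
LOG="/root/rsync-debug.log"

# Same options as the original command, plus -vv and --log-file.
CMD="rsync -a -vv --partial --inplace --numeric-ids --stats --human-readable --delete --log-file=$LOG $SRC $DEST"

# Print the command for review; run it by hand once the paths look right.
echo "$CMD"
```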

Regardless, it may be worth opening a bug against rsync on Debian. TurnKey is basically a customised Debian and we install rsync (plus most of the rest of the OS) directly from Debian repos. v14.x was based on Debian Jessie; v15.x is based on Debian Stretch. The rsync package maintainer(s) may have some further more specific guidance on troubleshooting and gathering further info.

If you've never reported a bug to Debian before, you might find it a bit bizarre (it's quite different to posting a bug on a web based bug tracker). Debian do provide a fair bit of documentation (e.g. see here, here and/or here) but I just came across a video on YouTube which looks quite good (TBH I didn't actually watch it all, but it seemed pretty good).

Please post back on how your troubleshooting etc goes. I'll try to help out as much as I can.

You were right, there was one difference between the v14 and v15 deployments. It was the NAS itself, which for v15 is brand new and super-fast, connected over a 1Gb line. As it appears, rsync by its nature tries to "eat as much as it can", and for large files it saturates I/O, and hence the CPU. For a VM that is even more dangerous, as in my case.

The solution is provided by rsync itself: use the option --bwlimit=KBPS (limit I/O bandwidth; KBytes per second).

This option allows you to specify a maximum transfer rate in kilobytes per second. This option is most effective when using rsync with large files (several megabytes and up). Due to the nature of rsync transfers, blocks of data are sent, then if rsync determines the transfer was too fast, it will wait before sending the next data block. The result is an average transfer rate equaling the specified limit. A value of zero specifies no limit.
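
For anyone wanting a starting point, here is a minimal sketch of picking a --bwlimit value. The 1 Gbit/s link speed comes from my setup, but the 40% cap, the helper name bwlimit_kbps and the paths are only assumptions for illustration, and it assumes the limit is counted in units of 1024 bytes per second. It prints the resulting command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: convert a link speed (Mbit/s) and a percentage cap
# into a KB/s figure for rsync's --bwlimit (assumed to be 1024-byte units).
bwlimit_kbps() {
    link_mbit=$1
    pct=$2
    echo $(( link_mbit * 1000000 / 8 / 1024 * pct / 100 ))
}

# Assumed values: 1 Gbit/s link to the NAS, capped at an arbitrary 40%.
LIMIT=$(bwlimit_kbps 1000 40)

# Hypothetical source and destination paths; adjust to your setup.
SRC="/var/vmail/"
DEST="/mnt/nas/backup/vmail/"

# Print the command instead of running it, so it can be reviewed first.
echo rsync -a --partial --inplace --numeric-ids --stats --human-readable \
    --delete "--bwlimit=${LIMIT}" "$SRC" "$DEST"
```

The right percentage will depend on the guest and the NAS; starting low and raising the limit until the guest starts to struggle is probably the safest approach.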

 

Jeremy Davis's picture

Glad to hear that you managed to diagnose the issue and found a workaround. Thanks too for posting back. I'm sure that will help someone else. It's also good for me to be aware of! :)
