Large File Migrations in OS X
After coming up short on some searches, I wanted to open up a simple discussion about file transfers in OS X.
What is the fastest and most reliable tool for migrating large volumes of data with OS X?
OS X is my client machine, and there is a mix of large volumes / filesystems on a 10GbE network. I spend a lot of time moving large volumes of data around. They seem to be getting bigger and bigger by the year, especially with 4K and TV broadcast content.
rsync was my tool of choice for a lot of things. cp and mv have their time and place. Finder is particularly taxing on CPU and unreliable for large volumes of data. If it fails you're starting over.
But they're all slow. Finder can pull 450-600 MB/s on average, but CLI tools kinda idle out around 120-200 MB/s. Maybe I'm passing bad arguments on commands. But none of those tools like it when you throw a 10TB volume at them.
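For a sense of scale, here is a rough back-of-the-envelope sketch of what those rates mean for a 10TB volume. It assumes decimal units (1 TB = 1,000,000 MB) and a perfectly sustained rate, which real transfers rarely manage; the function name is just for illustration.

```python
def transfer_hours(size_tb: float, rate_mb_s: float) -> float:
    """Hours needed to move size_tb terabytes at a sustained rate_mb_s.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    size_mb = size_tb * 1_000_000
    return size_mb / rate_mb_s / 3600


if __name__ == "__main__":
    # Rates quoted above: CLI tools at 120-200 MB/s, Finder at 450-600 MB/s.
    for rate in (120, 200, 450, 600):
        hours = transfer_hours(10, rate)
        print(f"10 TB at {rate} MB/s: about {hours:.1f} hours")
```

At 120 MB/s that works out to roughly 23 hours, versus under 5 hours at 600 MB/s.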
Anyways, just curious how people handle moving large amounts of data in OS X. I've never really seen an app or tool that did a perfect job. But I live in a bubble so it's quite possible I just don't know.
I haven't thought about this in a while, so your question raised my curiosity. While we wait for a person using a better sync tool to show, have you tried
Not sure what its underlying engine is. I'm going to try it over the weekend.
Looking forward to an answer to this also!
[Will Duncan] "rsync was my tool of choice for a lot of things."
+1
At work we use rsync to keep production and staging servers in sync, but these are in the hundreds of GB, not the TB range. It's still the Unix tool of choice for large data synchronizations.
As a test, you might try doing a local rsync to another local disk on the same host on the loopback NIC (127.0.0.1) and see what the throughput is. I assume that would bypass the network in loopback mode and give you a disk-to-disk performance measurement (I've never tried this, but it sounds like it should work). If the throughput is greater that way, then the network is your limiting factor. If it's the same, then perhaps the storage system is the limiting factor. Rsync is pretty efficient, so it shouldn't be the limiting factor.
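A minimal sketch of that kind of disk-to-disk check, under some assumptions: it uses a plain Python file copy rather than rsync, so it only gives a baseline for what the disks can do, and the function name and the 64 MB demo file are made up for illustration. For a real test you would point it at a large file on one volume and a destination on another; small files mostly measure the OS cache.

```python
import os
import shutil
import tempfile
import time


def measure_copy_mb_s(src_path: str, dst_path: str) -> float:
    """Copy src_path to dst_path and return throughput in MB/s (decimal)."""
    size_bytes = os.path.getsize(src_path)
    start = time.perf_counter()
    shutil.copyfile(src_path, dst_path)
    # Ask the OS to flush the destination so we time more than a cache write.
    with open(dst_path, "rb+") as f:
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return size_bytes / 1_000_000 / elapsed


if __name__ == "__main__":
    # Demo only: copies a 64 MB scratch file inside one temp directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.bin")
        dst = os.path.join(tmp, "dest.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(64 * 1_000_000))
        print(f"{measure_copy_mb_s(src, dst):.0f} MB/s")
```

If this number is far above what rsync reports across the network, that points at the network or the protocol rather than the disks.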
Of course, rsync compares each file against the destination and only sends the differences if the file already exists. If you don't need this capability, rcp will blindly remote copy without checking for existing files. This may cut down some of the overhead of checking first. I'm not sure how much overhead that is, and if some of the files haven't changed then you definitely want to use rsync instead.
I haven't used that sync tool before. I'm mostly MacOS / Linux at work, and a split between rsync and robocopy, which looks to be the base of a lot of tools written for Windows.
I've spent a lot of time over the last few months transferring multiple hundreds of TBs of data, mostly over a 10GbE local network between OS X and NTFS RAIDs along with an object storage array.
rsync has definitely been the most reliable tool to date. But so much of the documentation, examples, and search results are based on ssh'ing into a server where network speeds and disk IO are very slow.
Here is a good link that I came across. I am referring to the comments:
Since we're on a 10GbE local network, I opted to pass `-avhW` when dumping a drive from the field. I don't really have anyone to bounce this stuff off of, but passing the W in the arguments gets me closer to 180 MB/s vs. the typical 90-120, since it's no longer splitting up the file. So there's a little bit of a speed boost, it seems, which adds up when you have a lot of data.
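For anyone who wants to script it, here is a small sketch of that invocation wrapped in Python. The flags are rsync's own (`-a` archive, `-v` verbose, `-h` human-readable sizes, `-W` whole-file, which skips the delta-transfer algorithm); the helper names and the `/Volumes/...` paths are hypothetical placeholders, not anything from my actual setup.

```python
import shlex
import subprocess
from typing import List


def build_rsync_cmd(src: str, dst: str, whole_file: bool = True) -> List[str]:
    """Build an rsync argument list using -avh, plus W when whole_file is set."""
    flags = "-avh"
    if whole_file:
        flags += "W"  # -W / --whole-file: copy files whole, no delta transfer
    return ["rsync", flags, src, dst]


def run_rsync(src: str, dst: str) -> int:
    """Run rsync and return its exit code (0 means success)."""
    cmd = build_rsync_cmd(src, dst)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # Hypothetical paths; this only prints the command, it does not copy.
    # The trailing slash on the source means "contents of this directory".
    print(shlex.join(build_rsync_cmd("/Volumes/FieldDrive/",
                                     "/Volumes/Archive/FieldDrive/")))
```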
I've never been able to achieve anything over 180 MB/s with rsync though, and drive-to-drive speed should be much, much higher.
With Finder I'm still closer to 350 MB/s, but it's my understanding that OS X Finder can do some bad things to large volumes given enough time and copies / writes to the volume. I can see 900 MB/s with Finder in transfers to btrfs volumes, but those are newer and will probably degrade a lot as they get full.
I think what I need to learn to do is to run multiple rsync processes in parallel with each other. It's staying organized that's going to be the hard part.
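A rough sketch of what that might look like, assuming the source splits sensibly by top-level folder: one rsync process per top-level entry, a few at a time, with the exit codes collected per entry so it's clear afterwards what needs a rerun. The function names, the worker count of 4, and the `runner` hook are all my own placeholders, and whether parallel streams actually help depends on where the bottleneck is.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def top_level_entries(src_root: str) -> List[str]:
    """Sorted names of the files and folders directly under src_root."""
    return sorted(os.listdir(src_root))


def rsync_entry(src_root: str, dst_root: str, name: str) -> int:
    """Copy one top-level entry with rsync and return its exit code."""
    src = os.path.join(src_root, name)
    # No trailing slash on src, so rsync recreates `name` inside dst_root.
    cmd = ["rsync", "-ahW", src, os.path.join(dst_root, "")]
    return subprocess.run(cmd).returncode


def parallel_rsync(src_root: str, dst_root: str, workers: int = 4,
                   runner: Callable[[str, str, str], int] = rsync_entry
                   ) -> Dict[str, int]:
    """Run `runner` for each top-level entry, `workers` at a time.

    Returns a dict of entry name -> exit code (0 means success).
    """
    names = top_level_entries(src_root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(lambda n: runner(src_root, dst_root, n), names))
    return dict(zip(names, codes))
```

Anything in the returned dict with a nonzero code can simply be run again, since rsync will pick up where that entry left off. One caveat: if a single folder holds most of the data, the split is lopsided and one process ends up doing nearly all the work.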