> Your parallel rsync job is only getting 150 creates per second? What
> was the previous throughput?

I am actually not quite sure what the exact throughput was or is, or what I can expect; it varies so much. I am copying from a 23GB file list that is split into 3000 chunks, which are then processed by 16-24 parallel rsync processes (a rough sketch of this setup is appended at the end of this message). I have copied 27TB of the 64TB so far (according to df -h), and to my taste it's taking a lot longer than it should. The main problem is not that I'm trying to copy 64TB (a drop in the bucket); the problem is that it's 64TB of tiny, small, and medium-sized files.

This whole MDS mess, and the several pauses and restarts in between, has completely distorted my sense of how far into the process I actually am and how fast I should expect it to go. Right now it's starting again from the beginning, so I expect it'll be another day or so until it starts moving some real data again.

> The cache size looks correct here.

Yeah. The cache appears to be constant-size now. I am still getting the occasional "client failing to respond to cache pressure", but it goes away as fast as it comes.

> Try pinning if possible in each parallel rsync job.

I was considering that, but couldn't come up with a feasible pinning strategy. We have all those files of very different sizes spread very unevenly across a handful of top-level directories, so I get the impression that I couldn't do much (or any) better than the automatic balancer.

> Here are tracker tickets to resolve the issues you encountered:
>
> https://tracker.ceph.com/issues/41140
> https://tracker.ceph.com/issues/41141

Thanks a lot!
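
For reference, here is a minimal sketch of the chunked parallel rsync setup described above. It assumes the big file list holds one relative path per line and has already been split into files matching chunk.* (e.g. with split -n l/3000 -a 3 filelist chunk.); the source and destination paths, the chunk naming, and the worker count are placeholders, not the actual job.

#!/usr/bin/env python3
# Sketch: drive many rsync processes over pre-split chunks of a file list.
import glob
import subprocess
from concurrent.futures import ThreadPoolExecutor

SRC = "/mnt/source/"         # hypothetical source root
DST = "/mnt/cephfs/target/"  # hypothetical CephFS destination
PARALLEL = 16                # 16-24 parallel rsync processes, as above

def run_chunk(chunk_file):
    # --files-from restricts this rsync invocation to the paths listed
    # in one chunk file (and implies --relative, so paths are preserved).
    return subprocess.call(
        ["rsync", "-a", "--files-from", chunk_file, SRC, DST])

with ThreadPoolExecutor(max_workers=PARALLEL) as pool:
    results = list(pool.map(run_chunk, sorted(glob.glob("chunk.*"))))

failed = sum(1 for rc in results if rc != 0)
print(f"{len(results) - failed} chunks ok, {failed} chunks failed")

Threads are enough here because each worker just waits on an external rsync process; the actual copying and parallelism happen in rsync itself.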
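And for completeness, a sketch of what the suggested pinning would look like if the directory layout cooperated: assign each top-level directory to a fixed MDS rank via the ceph.dir.pin virtual extended attribute (equivalent to running setfattr -n ceph.dir.pin -v <rank> <dir>). The directory names and the number of active MDS ranks below are made up.

#!/usr/bin/env python3
# Sketch: pin CephFS top-level directories to MDS ranks via ceph.dir.pin.
import os

TOP_DIRS = ["/mnt/cephfs/dirA", "/mnt/cephfs/dirB",
            "/mnt/cephfs/dirC", "/mnt/cephfs/dirD"]  # hypothetical layout
ACTIVE_MDS_RANKS = 2                                 # hypothetical count

for i, d in enumerate(TOP_DIRS):
    rank = i % ACTIVE_MDS_RANKS
    # Pin this subtree (and everything under it) to one MDS rank.
    os.setxattr(d, "ceph.dir.pin", str(rank).encode())
    print(f"pinned {d} to mds.{rank}")

With sizes spread this unevenly across so few top-level directories, a static round-robin assignment like this is exactly the kind of thing the automatic balancer tends to handle better, which matches the reasoning above.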