I'd probably do #4 myself.
Generate a list of all buckets/keys before switching over to the new storage, and you have a todo list you can work against; just keep track of what you've done. It also gives you a straightforward way to throttle the transfer, by scaling the number of
workers doing the down/rm/up cycles. I'd think the most difficult parts would be managing the s3 credentials and making sure clients can tolerate per-key blips in availability. Something along the lines of the sketch below.
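A minimal sketch of that flow, using boto3 against two RGW S3 endpoints. The endpoint URLs, profile names, file paths, and worker count are placeholders, it assumes the buckets already exist on the new cluster, and it reads each object fully into memory (large objects would want multipart handling, which is part of why I'd reach for the tool mentioned further down):

#!/usr/bin/env python3
"""Option #4 sketch: build a key inventory, then move keys with a small
worker pool doing the down/rm/up cycle. Endpoints/profiles are examples."""
import concurrent.futures
import boto3

OLD = boto3.session.Session(profile_name="old-cluster").client(
    "s3", endpoint_url="http://old-rgw.example.com:7480")
NEW = boto3.session.Session(profile_name="new-cluster").client(
    "s3", endpoint_url="http://new-rgw.example.com:7480")

TODO_FILE = "todo.txt"   # one "bucket<TAB>key" per line, generated up front
DONE_FILE = "done.txt"   # appended as keys are moved, so runs are resumable
WORKERS = 8              # throttle the transfer by raising/lowering this


def build_inventory():
    """List every bucket/key on the old cluster before cutting over."""
    with open(TODO_FILE, "w") as out:
        for bucket in OLD.list_buckets()["Buckets"]:
            paginator = OLD.get_paginator("list_objects_v2")
            for page in paginator.paginate(Bucket=bucket["Name"]):
                for obj in page.get("Contents", []):
                    out.write(f"{bucket['Name']}\t{obj['Key']}\n")


def move_key(bucket, key):
    """One down/rm/up cycle; the key is briefly unavailable in between."""
    body = OLD.get_object(Bucket=bucket, Key=key)["Body"].read()
    OLD.delete_object(Bucket=bucket, Key=key)
    NEW.put_object(Bucket=bucket, Key=key, Body=body)
    return bucket, key


def run_transfer():
    """Work through the todo list, skipping keys already logged as done."""
    done = set()
    try:
        with open(DONE_FILE) as f:
            done = {tuple(line.rstrip("\n").split("\t")) for line in f}
    except FileNotFoundError:
        pass
    with open(TODO_FILE) as f:
        todo = [tuple(line.rstrip("\n").split("\t")) for line in f]
    todo = [item for item in todo if item not in done]

    with open(DONE_FILE, "a") as log, \
            concurrent.futures.ThreadPoolExecutor(max_workers=WORKERS) as pool:
        for bucket, key in pool.map(lambda bk: move_key(*bk), todo):
            log.write(f"{bucket}\t{key}\n")


if __name__ == "__main__":
    build_inventory()
    run_transfer()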
That said, I'd also be pretty tempted to shoot for bluestore at the same time if I were shuffling that much data anyway. It would complicate things, though, since you'd need to swap entire osds, not just pgs.
FWIW my group has released a tool that's a nice cli wrapper around boto; it makes things like multipart uploads and downloads, as well as verifying transfers, pretty straightforward. It also extends the default boto profile management
a bit, which makes it easy if you have a bunch of different credentials to juggle. I think scripting up option #4 with it would be pretty simple. It's available on github or in pypi: https://github.com/bibby/radula https://pypi.org/project/radula/
Aaron