Hello,

If you have a large amount of data, you could try Chorus from Clyso:

https://docs.clyso.com/blog/2024/01/24/opensourcing-chorus-project
(Opensourcing Chorus project | Clyso GmbH)

https://github.com/clyso/chorus
(clyso/chorus: s3 multi provider data lifecycle management)

It uses rclone for the copying, but it has some tricks that can be handy with large amounts of data.

Ondrej

> On 12. 4. 2024, at 4:00, Vladimir Sigunov <vladimir.sigunov@xxxxxxxxx> wrote:
>
> Hello,
> I used to use rclone for data synchronization between 2 Ceph clusters and for a one-directional sync from AWS to Ceph.
> In general, rclone is a really good and reliable piece of software, but it can be slow when syncing a large number of objects. Large means 10^6+ objects.
> As a disclaimer, my experience is 3 years old. Very likely rclone has improved since then, and it should definitely be considered, at least as a POC.
> You can lighten rclone's sync operations by skipping hashes and other costly comparisons, if that approach is appropriate for your project.
> Good luck!
> Sincerely,
> Vladimir.
>
> ________________________________
> From: Casey Bodley <cbodley@xxxxxxxxxx>
> Sent: Thursday, April 11, 2024 5:29:30 PM
> To: James McClune <mcclune.789@xxxxxxxxx>
> Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> Subject: Re: Migrating from S3 to Ceph RGW (Cloud Sync Module)
>
> unfortunately, this cloud sync module only exports data from ceph to a
> remote s3 endpoint, not the other way around:
>
> "This module syncs zone data to a remote cloud service. The sync is
> unidirectional; data is not synced back from the remote zone."
>
> i believe that rclone supports copying from one s3 endpoint to
> another. does anyone have experience with that?
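
On the rclone points above, here is a minimal sketch in Python of how an S3-to-S3 sync command with the cheaper comparisons might be assembled. The remote names `aws` and `ceph` and the bucket name are hypothetical and assume that matching remotes were set up beforehand with `rclone config`. The flags `--size-only`, `--ignore-checksum`, `--transfers`, `--checkers` and `--dry-run` are standard rclone options, but the values used here are guesses, not tuned recommendations.

```python
import shlex


def build_rclone_sync(src, dst, skip_hashes=True, transfers=32,
                      checkers=64, dry_run=True):
    """Build an rclone sync command line as a list of arguments.

    src and dst are "remote:bucket" strings. The remote names are
    assumptions and must already exist in the local rclone config.
    """
    cmd = [
        "rclone", "sync", src, dst,
        "--transfers", str(transfers),  # parallel file transfers
        "--checkers", str(checkers),    # parallel comparison workers
    ]
    if skip_hashes:
        # Compare objects by size only and skip the post-copy checksum check.
        cmd += ["--size-only", "--ignore-checksum"]
    if dry_run:
        # Report what would be copied without changing the destination.
        cmd.append("--dry-run")
    return cmd


# Hypothetical remotes and bucket, for illustration only.
cmd = build_rclone_sync("aws:my-bucket", "ceph:my-bucket")
print(shlex.join(cmd))
```

Starting with `dry_run=True` shows what would be transferred before anything is written. Skipping the hash checks trades integrity verification for speed, so it only fits if that is acceptable for the project.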
>
> On Thu, Apr 11, 2024 at 4:45 PM James McClune <mcclune.789@xxxxxxxxx> wrote:
>>
>> Hello Ceph User Community,
>>
>> I currently have a large Amazon S3 environment with terabytes of data
>> spread over dozens of buckets. I'm looking to migrate from Amazon S3 to an
>> on-site Ceph cluster using the RGW. I'm trying to figure out the
>> most efficient way to achieve this. Looking through the documentation, I
>> found articles related to the cloud sync module, released in Mimic (
>> https://docs.ceph.com/en/latest/radosgw/cloud-sync-module/). I also watched
>> a video on the cloud sync module. It *sounds* like this is the
>> functionality I'm looking for.
>>
>> Given I'm moving away from Amazon S3, I'm really just looking for a one-way
>> replication between the buckets (i.e. provide an Amazon S3 access
>> key/secret which is associated with the buckets and the same for the Ceph
>> environment, so object data can be replicated one-to-one, without creating
>> ad-hoc tooling). Once the data is replicated from S3 to Ceph, I plan on
>> modifying my boto connection objects to use the new Ceph environment. Is
>> what I'm describing feasible with the cloud sync module? Just looking for
>> some affirmation, given I'm not well versed in Ceph's RGW, especially
>> around multi-site configurations.
>>
>> Thanks,
>> Jimmy
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx