Dear Arnaud,

Thanks a lot for sharing your valuable experience; this method for CephFS disaster recovery is really unique and intriguing!

Out of curiosity, how do you back up the CephFS metadata? Should the backup be taken very frequently to avoid losing much data? Is any special tool needed to back up the CephFS metadata?

Best regards,
Samuel
huxiaoyu@xxxxxxxxxxxx

From: Arnaud M
Date: 2022-04-01 12:28
To: huxiaoyu@xxxxxxxxxxxx
CC: ceph-users
Subject: Re: Ceph remote disaster recovery at PB scale

Hello

I will speak about CephFS because that is what I am working on.

Of course you can do some kind of rsync or rclone between two CephFS clusters, but at petabyte scale it will be really slow and will cost a lot!

There is another approach that we tested successfully (only in test, not in production).

We created a replicated CephFS data pool (replica 3) and spread it across 3 datacenters: Beauharnois (Canada), Strasbourg (France) and Warsaw (Poland). So we had 1 replica per datacenter.

Only the CephFS metadata pool was on SSD (NVMe), close to the end users (in Strasbourg, France). Same for the MONs and MGRs (also in Strasbourg); in fact only the CephFS data was spread geographically.

We had high bandwidth and high latency (of course) between the datacenters, but it worked surprisingly well.

This way you can lose up to two datacenters without losing any data (more if you use more replicas). You just have to back up the MON and CephFS metadata, which are never a lot of data.

This strategy is only feasible for CephFS (as it is the least IOPS-demanding). If you need more IOPS, you should isolate the IOPS-hungry folders and run them on a separate pool backed by local SSDs. (Rough command sketches for these setups follow below the thread.)

All the best

Arnaud
Leviia https://leviia.com/en

On Fri, 1 Apr 2022 at 10:57, huxiaoyu@xxxxxxxxxxxx <huxiaoyu@xxxxxxxxxxxx> wrote:

Dear Ceph experts,

We are operating some Ceph clusters (both L and N versions) at PB scale and are now planning remote disaster recovery solutions. Among these clusters, most hold RBD volumes for OpenStack and K8s, a few serve S3 object storage, and very few are CephFS clusters.

For RBD volumes, we are planning to use RBD mirroring, and the data volume will reach several PBs. My questions are:

1) Is RBD mirroring with petabytes of data doable or not? Are there any practical limits on the total data size?
2) Should I use parallel rbd-mirror daemons to speed up the sync process, or would a single daemon be sufficient?
3) How far could the remote site lag behind: at most 1 minute, or more like 10 minutes?

For the S3 object store, we plan to use multisite replication, and thus:

4) Are there any practical limits on the total data size for S3 multisite replication?

And for CephFS data, I have no idea yet:

5) What would be the best practice for a CephFS disaster recovery scheme?

Thanks a lot in advance for your suggestions,

Samuel
huxiaoyu@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
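
On the metadata-backup question above: a minimal sketch of one way to do it, assuming a metadata pool named cephfs_metadata, a filesystem named cephfs and a /backup directory (all placeholder names) on a Nautilus-era toolset. This is not necessarily how Arnaud does it.

  # Dump the whole metadata pool to a file; CephFS metadata pools are usually small
  # (note: on a live filesystem this export is not crash-consistent)
  rados -p cephfs_metadata export /backup/cephfs_metadata.$(date +%F).bin

  # Optionally also export the MDS journal of rank 0
  cephfs-journal-tool --rank=cephfs:0 journal export /backup/cephfs_journal.$(date +%F).bin

  # Keep copies of the cluster maps as well
  ceph mon getmap -o /backup/monmap.$(date +%F)
  ceph osd getmap -o /backup/osdmap.$(date +%F)

How often to run it depends on how much metadata churn you can afford to lose; since the metadata pool is small, running it hourly or even more often is usually cheap.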
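
For the 3-datacenter layout Arnaud describes (one replica per datacenter, metadata on local NVMe), a rough sketch of the CRUSH side; the bucket, rule and pool names (bhs, sbg, waw, rep_per_dc, meta_nvme, cephfs_data, cephfs_metadata) are made up for illustration.

  # Declare one CRUSH bucket per datacenter and hang them under the default root
  ceph osd crush add-bucket bhs datacenter
  ceph osd crush add-bucket sbg datacenter
  ceph osd crush add-bucket waw datacenter
  ceph osd crush move bhs root=default
  ceph osd crush move sbg root=default
  ceph osd crush move waw root=default
  # (each host then gets moved under its datacenter bucket, e.g.
  #  ceph osd crush move host1 datacenter=sbg)

  # Replicated rule that places one copy per datacenter
  ceph osd crush rule create-replicated rep_per_dc default datacenter
  ceph osd pool set cephfs_data crush_rule rep_per_dc
  ceph osd pool set cephfs_data size 3

  # Keep the metadata pool on local NVMe via a device-class rule
  ceph osd crush rule create-replicated meta_nvme default host nvme
  ceph osd pool set cephfs_metadata crush_rule meta_nvme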
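
And for isolating a high-IOPS folder on a separate local SSD pool, a sketch assuming a new pool cephfs_data_ssd, a filesystem named cephfs mounted at /mnt/cephfs, and a directory hot/ (all hypothetical names).

  # Create the SSD-backed pool and attach it to the filesystem as an extra data pool
  ceph osd pool create cephfs_data_ssd 64
  ceph fs add_data_pool cephfs cephfs_data_ssd

  # Newly created files under this directory will be stored in the SSD pool
  # (existing files are not moved; only new files pick up the layout)
  setfattr -n ceph.dir.layout.pool -v cephfs_data_ssd /mnt/cephfs/hot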
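
On the RBD questions (1-3): as far as I know there is no hard size cap in rbd-mirror itself; the practical limits are the journaling overhead on the images and the WAN bandwidth, and the lag simply grows whenever the link cannot keep up with the write rate. Recent releases (Nautilus and later, if I remember correctly) can run several active rbd-mirror daemons per cluster. A bare-bones sketch of journal-based mirroring for a pool named volumes, with site-a / site-b as hypothetical cluster names, vm-disk-1 as a hypothetical image and client.rbd-mirror as a placeholder user:

  # On both clusters: enable mirroring for the pool (pool mode mirrors every journaled image)
  rbd mirror pool enable volumes pool

  # Journal-based mirroring needs the journaling feature (exclusive-lock is on by default)
  rbd feature enable volumes/vm-disk-1 journaling

  # Register each cluster as the other's peer (pre-Octopus style)
  rbd --cluster site-a mirror pool peer add volumes client.rbd-mirror@site-b
  rbd --cluster site-b mirror pool peer add volumes client.rbd-mirror@site-a

  # Run the rbd-mirror daemon at the receiving site, then watch the per-image lag
  rbd --cluster site-b mirror pool status volumes --verbose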
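
On question (4): I am not aware of a documented hard limit on total data size for RGW multisite either; in practice object count, change rate and sync/shard tuning tend to matter more than raw capacity. A heavily abbreviated sketch of the usual realm/zonegroup/zone setup, where every name, endpoint and credential below is a placeholder:

  # Primary site
  radosgw-admin realm create --rgw-realm=prod --default
  radosgw-admin zonegroup create --rgw-zonegroup=eu --endpoints=http://rgw-primary:8080 --master --default
  radosgw-admin zone create --rgw-zonegroup=eu --rgw-zone=eu-primary --endpoints=http://rgw-primary:8080 --master --default
  radosgw-admin user create --uid=sync-user --display-name="sync user" --system
  radosgw-admin zone modify --rgw-zone=eu-primary --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY
  radosgw-admin period update --commit

  # Secondary (DR) site
  radosgw-admin realm pull --url=http://rgw-primary:8080 --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY
  radosgw-admin zone create --rgw-zonegroup=eu --rgw-zone=eu-dr --endpoints=http://rgw-dr:8080 --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY
  radosgw-admin period update --commit

  # After restarting the RGWs, watch replication progress
  radosgw-admin sync status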