Re: Ceph remote disaster recovery at PB scale

Hello

I will speak about CephFS, because that is what I am working on.

Of course you can do some kind of rsync or rclone between two CephFS
clusters, but at petabyte scale it will be really slow and cost a lot!
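
For reference, a minimal sketch of that rsync approach between two mounted
CephFS trees (the mount points below are placeholders) would be something
like this; walking that many inodes is exactly what makes it so slow at PB
scale:

    rsync -aHAX --numeric-ids --delete /mnt/cephfs-src/ /mnt/cephfs-dst/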

There is another approach that we tested successfully (only in a test
environment, not in production).

We created a replicated CephFS data pool (replica 3) and spread it across
3 datacenters: Beauharnois (Canada), Strasbourg (France) and Warsaw
(Poland), so we had 1 replica per datacenter.
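
A minimal sketch of that layout, assuming the OSD hosts are already grouped
under datacenter buckets in the CRUSH map (pool names, rule names and PG
counts below are just examples):

    # one replica per datacenter for the CephFS data pool
    ceph osd crush rule create-replicated data_per_dc default datacenter
    ceph osd pool create cephfs_data 256 256 replicated data_per_dc
    ceph osd pool set cephfs_data size 3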

Only the CephFS metadata pool was on SSD (NVMe), close to the end users,
in Strasbourg (France).

Same for the MONs and MGRs (also in Strasbourg); in fact, only the CephFS
data was spread geographically.
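
A sketch of how the metadata pool can be pinned to local fast OSDs with a
device-class CRUSH rule; the "strasbourg" bucket and the names below are
hypothetical, and the device class may be ssd or nvme depending on how your
OSDs are classed:

    # keep CephFS metadata on fast OSDs in one datacenter only
    ceph osd crush rule create-replicated meta_local_ssd strasbourg host ssd
    ceph osd pool create cephfs_metadata 64 64 replicated meta_local_ssd
    ceph fs new cephfs cephfs_metadata cephfs_data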

We had high bandwidth and (of course) high latency between the
datacenters, but it worked surprisingly well.

This way you can lose up to two datacenters without losing any data (more
if you use more replicas). You just have to back up the MON and CephFS
metadata, which are never a lot of data.
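
As a rough sketch of such a backup (paths and pool name are placeholders,
and note that a plain rados export of a live metadata pool is not a
consistent point-in-time snapshot):

    # dump the cluster maps and copy the (small) metadata pool
    ceph mon getmap -o /backup/monmap
    ceph osd getcrushmap -o /backup/crushmap
    rados -p cephfs_metadata export /backup/cephfs_metadata.dump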

This strategy is only feasible for CephFS (as it is the least
IOPS-demanding workload).

If you need more IOPS, then you should isolate the high-IOPS-demanding
folders and run them on a separate pool, locally on SSD.
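
That can be done with an extra data pool plus a file layout on the hot
directory (rule, pool and path names below are hypothetical):

    # add a local-SSD data pool and pin one directory tree to it
    ceph osd crush rule create-replicated data_local_ssd strasbourg host ssd
    ceph osd pool create cephfs_data_fast 64 64 replicated data_local_ssd
    ceph fs add_data_pool cephfs cephfs_data_fast
    setfattr -n ceph.dir.layout.pool -v cephfs_data_fast /mnt/cephfs/hot-folder

New files created under that directory will then land in the fast pool
(existing files keep their old layout).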

All the best

Arnaud

Leviia https://leviia.com/en

On Fri, Apr 1, 2022 at 10:57, huxiaoyu@xxxxxxxxxxxx <huxiaoyu@xxxxxxxxxxxx>
wrote:

> Dear Cepher experts,
>
> We are operating some Ceph clusters (both L and N versions) at PB scale,
> and are now planning remote disaster recovery solutions. Among these
> clusters, most are RBD volumes for OpenStack and K8s, a few are for S3
> object storage, and very few are CephFS clusters.
>
> For RBD volumes, we are planning to use RBD mirroring, and the data
> volume will reach several PBs. My questions are:
> 1) Is RBD mirroring with petabytes of data doable or not? Are there any
> practical limits on the size of the total data?
> 2) Should I use parallel rbd-mirror daemons to speed up the sync process,
> or would a single daemon be sufficient?
> 3) What could be the lag time at the remote site? At most 1 minute, or
> 10 minutes?
>
> For the S3 object store, we plan to use multisite replication, and thus:
> 4) Are there any practical limits on the size of the total data for S3
> multisite replication?
>
> And for CephFS data, I have no idea.
> 5) What could be the best practice for a CephFS disaster recovery scheme?
>
>
> Thanks a lot in advance for suggestions,
>
>
> Samuel
>
>
>
>
> huxiaoyu@xxxxxxxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



