Feedback/questions regarding cephfs-mirror

Hello,

we're currently evaluating cephfs-mirror.

We have two data centers with one Ceph cluster in each DC. For now, the Ceph clusters are only used for CephFS. On each cluster we have one FS that contains a directory for customer data, with one subdirectory per customer. The customers access their directories via CephFS. Generally, a customer only has data in one data center.

In order to be able to quickly restore the service, we want to mirror all customer data in one DC to the Ceph cluster in the other DC. Then, in case one Ceph cluster becomes unavailable, we would do the following (let's say the clusters are cluster A and cluster B, and cluster A became unavailable):

- "Unmount" the broken mount on clients connecting to cluster A.
- Mount the customer directories from cluster B.
- Repair/restore cluster A.
- Break the mirror relation from cluster A to cluster B.
- Create a mirror relation from cluster B to cluster A (for the data that should be on A).
- Ensure that cluster A is regularly updated with the current data from cluster B.
- In a maintenance window: unmount the directory on all clients that should connect to cluster A, sync all data (that should be on A) from B to A, break the mirror relation from B to A, have all clients mount the directories from A, and create a mirror relation from A to B.


Setting up the mirroring looks like this:

# Cluster A: There are two FSs: fs-a for customer data, fs-b-backup for data mirrored from cluster B
# Cluster B: There are two FSs: fs-b for customer data, fs-a-backup for data mirrored from cluster A

(cluster-a)# ls /mnt/fs-a/customers
customer1 customer2 customer3

(cluster-b)# ls /mnt/fs-b/customers
customer4 customer5 customer6
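
# Note: the *-backup FSs must already exist. How they are created doesn't
# matter for the mirroring itself; a minimal sketch via the orchestrator
# (assuming default pools/placement are acceptable) would be:
(cluster-a)# ceph fs volume create fs-b-backup
(cluster-b)# ceph fs volume create fs-a-backup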

# Enable mirroring on both clusters
(cluster-a)# ceph orch apply cephfs-mirror
(cluster-a)# ceph mgr module enable mirroring
(cluster-b)# ceph orch apply cephfs-mirror
(cluster-b)# ceph mgr module enable mirroring
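
# Sanity check on each cluster that the cephfs-mirror service is deployed and
# the mirroring mgr module is enabled (hedged example; exact output varies by release)
(cluster-a)# ceph orch ls cephfs-mirror
(cluster-a)# ceph mgr module ls | grep -i mirroring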

# Setup mirroring of fs-a from cluster A to cluster B
(cluster-b)# ceph fs authorize fs-a-backup client.mirror_a / rwps
(cluster-b)# ceph fs snapshot mirror peer_bootstrap create fs-a-backup client.mirror_a cluster-a

(cluster-a)# ceph fs snapshot mirror enable fs-a
(cluster-a)# ceph fs snapshot mirror peer_bootstrap import fs-a <token>
(cluster-a)# ceph fs snapshot mirror add fs-a /customers/customer1
(cluster-a)# ceph fs snapshot mirror add fs-a /customers/customer2
(cluster-a)# ceph fs snapshot mirror add fs-a /customers/customer3

# Setup mirroring of fs-b from cluster B to cluster A
(cluster-a)# ceph fs authorize fs-b-backup client.mirror_b / rwps
(cluster-a)# ceph fs snapshot mirror peer_bootstrap create fs-b-backup client.mirror_b cluster-b

(cluster-b)# ceph fs snapshot mirror enable fs-b
(cluster-b)# ceph fs snapshot mirror peer_bootstrap import fs-b <token>
(cluster-b)# ceph fs snapshot mirror add fs-b /customers/customer4
(cluster-b)# ceph fs snapshot mirror add fs-b /customers/customer5
(cluster-b)# ceph fs snapshot mirror add fs-b /customers/customer6
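
# Sanity check on the source FSs: list the configured peer and see which
# mirror daemon instance a directory got mapped to (example for cluster A)
(cluster-a)# ceph fs snapshot mirror peer_list fs-a
(cluster-a)# ceph fs snapshot mirror dirmap fs-a /customers/customer1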

# Snapshots for the customer directories are created daily via schedule.
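# Roughly like this via the snap_schedule mgr module (interval and retention
# values are just examples; exact arguments may differ between releases)
(cluster-a)# ceph mgr module enable snap_schedule
(cluster-a)# ceph fs snapshot schedule add /customers/customer1 24h --fs fs-a
(cluster-a)# ceph fs snapshot schedule retention add /customers/customer1 d 7 --fs fs-a
(cluster-a)# ceph fs snapshot schedule status /customers/customer1 --fs fs-a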

# Result: Customer data from A mirrored to B and vice-versa
(cluster-a)# ls /mnt/fs-b-backup/customers
customer4 customer5 customer6

(cluster-b)# ls /mnt/fs-a-backup/customers
customer1 customer2 customer3
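
# To check sync progress we query the cephfs-mirror daemon's admin socket on
# the source cluster (sketch: socket name, FS id and peer UUID are placeholders;
# with a cephadm deployment this is run from inside the daemon's container)
(cluster-a)# ceph --admin-daemon /var/run/ceph/cephfs-mirror.<id>.asok fs mirror status fs-a@<fs-id>
(cluster-a)# ceph --admin-daemon /var/run/ceph/cephfs-mirror.<id>.asok fs mirror peer status fs-a@<fs-id> <peer-uuid>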


In order to fail over from cluster A to cluster B, we do the following:

# Create clients for access to fs-a-backup on cluster B
# Mount customer directories from fs-a-backup on cluster B

# When cluster A is available again
(cluster-a)# ceph fs snapshot mirror remove fs-a /customers/customer1
(cluster-a)# ceph fs snapshot mirror remove fs-a /customers/customer2
(cluster-a)# ceph fs snapshot mirror remove fs-a /customers/customer3
(cluster-a)# ceph fs snapshot mirror disable fs-a

(cluster-a)# ceph fs authorize fs-a client.mirror_b_failover / rwps
(cluster-a)# ceph fs snapshot mirror peer_bootstrap create fs-a client.mirror_b_failover cluster-b
(cluster-a)# rmdir /mnt/fs-a/customers/*/.snap/*
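
# Quick check: each .snap directory should list no entries anymore
(cluster-a)# ls /mnt/fs-a/customers/*/.snap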

(cluster-b)# setfattr -x ceph.mirror.info /mnt/fs-a-backup
(cluster-b)# ceph fs snapshot mirror enable fs-a-backup
(cluster-b)# ceph fs snapshot mirror peer_bootstrap import fs-a-backup <token>
(cluster-b)# ceph fs snapshot mirror add fs-a-backup /customers/customer1
(cluster-b)# ceph fs snapshot mirror add fs-a-backup /customers/customer2
(cluster-b)# ceph fs snapshot mirror add fs-a-backup /customers/customer3
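
# For completeness: whether the setfattr workaround above actually removed the
# xattr can be checked before the "snapshot mirror enable" on fs-a-backup
# (expected to report that the attribute does not exist)
(cluster-b)# getfattr -n ceph.mirror.info /mnt/fs-a-backup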


Now, all clients connect to cluster B and cluster A is only used as a mirror destination. At some point we will have a maintenance window and switch all clients that should be on cluster A back to cluster A.


After testing this setup, there are several issues we have with the current cephfs-mirror implementation:

- The xattr ceph.mirror.info seems not to be removed when mirroring is disabled on an FS (haven't yet gotten around to debugging why - looks like a simple bug). As shown above, this can be worked around by manually removing the xattr.
- For us it would be convenient if an FS could be mirror source and destination at the same time. Then we wouldn't need to have two FSs. In that case, for each mirrored directory the destination directory name would also need to be specified (e.g. mirroring fs-a:/customers/customer1 to fs-b:/backup/customer1).
- Because mirroring between two FSs is currently always in one direction, it is not possible to mirror different directories in different directions. That would be really useful for a failback, because organizing one big maintenance window where all customers are failed back at once is very difficult.
- The ceph.dir.layout.pool_namespace xattr is not mirrored - we use this attribute to separate the RADOS objects of different customers. The workaround is to manually create the customer directories on the mirror destination and set the xattr before mirroring starts (see the sketch below this list).
- If files are deleted on the mirror destination, they are not re-created and there is no notification that files are now missing.
- Mirroring back to the source cluster after a failover requires that all snapshots are deleted on the (old) source FS, and then everything is mirrored even though most of the data is already there. The same happens when failing back. For us this is several tens of TBs and several hundred million files, so having the initial mirroring use the files that are already present would be a huge improvement.
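
For the pool_namespace workaround mentioned above, a rough sketch (directory and namespace names are just examples):

# On the mirror destination, pre-create the customer directory and set the
# namespace before adding the directory to mirroring
(cluster-b)# mkdir -p /mnt/fs-a-backup/customers/customer1
(cluster-b)# setfattr -n ceph.dir.layout.pool_namespace -v customer1 /mnt/fs-a-backup/customers/customer1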


We did all testing with 16.2.7, but as far as I can tell our observations still apply to 16.2.9 and Quincy.

Are there any plans for changes that would solve some of those issues listed above?

I would be happy to create tickets in Tracker in case that is helpful.

Best regards,

Andreas
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


