Dear Ceph developers and users,

We are running Ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable) and have been using cephadm since version 15 (Octopus). We mirror several CephFS directories from our main cluster out to a second, mirror cluster. Particularly with the bigger directories (over 900 TB and 186 million files), we have noticed that mirroring is very slow. On the mirror, the client IO shows a write speed of only 0 to 10 MB/s most of the time, and the target peer directory often does not grow in size during synchronization (checked with: getfattr -n ceph.dir.rbytes).

The status of the syncs is always fine, i.e. syncing and not failing:

0|0[root@osd-1 /var/run/ceph/55633ec3-6c0c-4a02-990c-0f87e0f7a01f]# ceph --admin-daemon ceph-client.cephfs-mirror.osd-1.ydsqsw.7.94552861013544.asok fs mirror peer status cephfs@1 c66afb80-593f-4c42-a120-dd3b6fca26bc
{
    "/irods/sodar": {
        "state": "syncing",
        "current_sycning_snap": {
            "id": 7552,
            "name": "scheduled-2022-08-22-13_00_00"
        },
        "last_synced_snap": {
            "id": 7548,
            "name": "scheduled-2022-08-22-12_00_00",
            "sync_duration": 37828.164744490001,
            "sync_time_stamp": "13240678.542916s"
        },
        "snaps_synced": 1,
        "snaps_deleted": 11,
        "snaps_renamed": 0
    }
}

The cluster nodes (6 per cluster) are connected to the switches with dual 40G NICs, and the switches are interconnected with 2x 100G. Simple write operations from other clients to the mirror CephFS reach 300 to 400 MB/s, so the network does not seem to be the issue here.

We started digging into the debug logs of the cephfs-mirror daemon / docker container, with the debug level set to 20 (otherwise there are no messages at all, so no errors). We observed a lot of messages with "need_data_sync=0, need_attr_sync=1", leading us to the assumption that instead of actual data, a lot of attributes are being synced.

We then looked at specific examples in the logs and tried to make sense of the steps involved from the source code. Most of the messages come from cephfs::mirror::PeerReplayer:
https://github.com/ceph/ceph/blob/6fee777d603aebce492c57b41f3b5760d50ddb07/src/tools/cephfs_mirror/PeerReplayer.cc

As we understand it, the do_synchronize function checks whether data (need_data_sync) or attributes (need_attr_sync) should be synchronized using should_sync_entry, and if necessary performs the sync using remote_file_op. should_sync_entry reports different ctimes for our examples, e.g.:

local cur statx: mode=33152, uid=996, gid=993, size=154701172, ctime=2022-01-28T12:54:21.176004+0000, ...
local prev statx: mode=33152, uid=996, gid=993, size=154701172, ctime=2022-08-22T11:03:18.578380+0000, ...

Based on these differing ctimes, should_sync_entry then decides that attributes need to be synced:

*need_attr_sync = (cstx.stx_ctime != pstx.stx_ctime)
https://github.com/ceph/ceph/blob/6fee777d603aebce492c57b41f3b5760d50ddb07/src/tools/cephfs_mirror/PeerReplayer.cc#L911

We assume cur statx/cstx refers to the file in the snapshot currently being mirrored. But what exactly is prev statx/pstx? Is it the peer path, or the last snapshot on the mirror peer?

We can confirm that the ctimes differ between the main cluster and the mirror. On the main cluster, the ctimes are consistent across every snapshot (since the files did not change). On the mirror, the ctimes increase with every snapshot towards more recent dates. Given that the cephfs-mirror daemon writes the data to the mirror as a CephFS client, it seems plausible that the data on the mirror ends up with different / more recent ctimes (from the writing).
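To make our reading concrete, here is a minimal sketch of the decision as we understand it. The types and names are our own simplification (not the actual ceph_statx structures or the real should_sync_entry signature), and the data-sync condition is only illustrative:

#include <cstdint>

// Simplified stand-ins for the statx data shown in the debug log;
// these are NOT the real ceph_statx types.
struct Timestamp { int64_t sec; int64_t nsec; };
struct EntryStat {
    uint64_t  size;
    Timestamp mtime;
    Timestamp ctime;
};

inline bool operator!=(const Timestamp &a, const Timestamp &b) {
    return a.sec != b.sec || a.nsec != b.nsec;
}

// cstx: entry in the snapshot currently being synchronized ("local cur statx")
// pstx: entry from the previous state it is compared against ("local prev statx")
void should_sync_entry_sketch(const EntryStat &cstx, const EntryStat &pstx,
                              bool *need_data_sync, bool *need_attr_sync)
{
    // Data is only re-transferred when the content appears to have changed
    // (illustrative condition, not the exact upstream check).
    *need_data_sync = (cstx.size != pstx.size) || (cstx.mtime != pstx.mtime);

    // Attributes are re-applied whenever the ctimes differ -- this is the
    // check from PeerReplayer.cc quoted above.
    *need_attr_sync = (cstx.ctime != pstx.ctime);
}

In this reading, unchanged files would keep need_data_sync=0 but get need_attr_sync=1 as soon as the two ctimes diverge, which matches the log messages we see.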
Also, when the mirror daemon syncs the attributes to the mirror, wouldn't this trigger a new/current ctime there as well? Our assumption is that syncing an old ctime will actually result in a new ctime, and thus trigger the sync of attributes over and over again (at least with every snapshot synced).

So is ctime the proper parameter to test whether attributes need to be synced, or should it rather be excluded? In other words, is this check the right thing to do:

*need_attr_sync = (cstx.stx_ctime != pstx.stx_ctime)

Is it reasonable to assume that these attribute syncs are responsible for our slow mirroring, or is there anything else we should look out for? And are there actually commands or logs that show us the speed of the mirroring? We only know about sync_duration and sync_time_stamp (as in the status above). But then, how can we actually determine the size of a snapshot, or the difference between two snapshots, so that one can make speed calculations for the latest sync?

What is your general experience with mirroring performance, in particular with bigger CephFS directories towards petabytes?

Mirroring (backing up) our data is a really crucial issue for us (and certainly many others), so we are looking forward to your input. Thanks a lot in advance.

Best wishes,
Mathias