Hi Mathias,

(Apologies for the super late reply - I was getting back from a long
vacation and missed seeing this.)

I updated the tracker ticket. Let's move the discussion there...

On Mon, Nov 28, 2022 at 7:46 PM Venky Shankar <vshankar@xxxxxxxxxx> wrote:
>
> On Tue, Aug 23, 2022 at 10:01 PM Kuhring, Mathias
> <mathias.kuhring@xxxxxxxxxxxxxx> wrote:
> >
> > Dear Ceph developers and users,
> >
> > We are using ceph version 17.2.1
> > (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable).
> > We have been using cephadm since version 15 octopus.
> >
> > We mirror several CephFS directories from our main cluster out to a
> > second mirror cluster.
> > In particular with bigger directories (over 900 TB and 186 M files),
> > we noticed that mirroring is very slow.
> > On the mirror, most of the time we only observe a write speed of 0 to 10
> > MB/s in the client IO.
> > The target peer directory often doesn't show an increase in size during
> > synchronization (when we check with: getfattr -n ceph.dir.rbytes).
> >
> > The status of the syncs is always fine, i.e. syncing and not failing:
> >
> > 0|0[root@osd-1 /var/run/ceph/55633ec3-6c0c-4a02-990c-0f87e0f7a01f]# ceph
> > --admin-daemon
> > ceph-client.cephfs-mirror.osd-1.ydsqsw.7.94552861013544.asok fs mirror
> > peer status cephfs@1 c66afb80-593f-4c42-a120-dd3b6fca26bc
> > {
> >     "/irods/sodar": {
> >         "state": "syncing",
> >         "current_sycning_snap": {
> >             "id": 7552,
> >             "name": "scheduled-2022-08-22-13_00_00"
> >         },
> >         "last_synced_snap": {
> >             "id": 7548,
> >             "name": "scheduled-2022-08-22-12_00_00",
> >             "sync_duration": 37828.164744490001,
> >             "sync_time_stamp": "13240678.542916s"
> >         },
> >         "snaps_synced": 1,
> >         "snaps_deleted": 11,
> >         "snaps_renamed": 0
> >     }
> > }
> >
> > The cluster nodes (6 per cluster) are connected with dual 40G NICs to
> > the switches.
> > The connections between the switches are 2x 100G.
> > Simple write operations from other clients to the mirror CephFS result
> > in writes of e.g. 300 to 400 MB/s.
> > So the network doesn't seem to be the issue here.
> >
> > We started to dig into the debug logs of the cephfs-mirror daemon /
> > docker container.
> > We set the debug level to 20. Otherwise there are no messages at all
> > (so no errors).
> >
> > We observed a lot of messages with "need_data_sync=0, need_attr_sync=1",
> > leading us to the assumption that, instead of actual data, a lot of
> > attributes are being synced.
> >
> > We started looking at specific examples in the logs and tried to make
> > sense from the source code of which steps are happening.
> > Most of the messages are coming from cephfs::mirror::PeerReplayer:
> > https://github.com/ceph/ceph/blob/6fee777d603aebce492c57b41f3b5760d50ddb07/src/tools/cephfs_mirror/PeerReplayer.cc
> >
> > We figured that the do_synchronize function checks whether data
> > (need_data_sync) or attributes (need_attr_sync) should be synchronized,
> > using should_sync_entry, and if necessary performs the sync using
> > remote_file_op.
> >
> > should_sync_entry reports different ctimes for our examples, e.g.:
> > local cur statx: mode=33152, uid=996, gid=993, size=154701172,
> > ctime=2022-01-28T12:54:21.176004+0000, ...
> > local prev statx: mode=33152, uid=996, gid=993, size=154701172,
> > ctime=2022-08-22T11:03:18.578380+0000, ...
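> >
> > To show how we read that decision, here is a reduced sketch. It is not
> > the actual Ceph source: the real should_sync_entry works on struct
> > ceph_statx from libcephfs and handles directories, missing entries and
> > type changes separately, the data-sync condition (size or mtime differs)
> > is only our reading of it, and the names, types and the mtime sample
> > values are ours (mtime is truncated in our log excerpt). The exact
> > attribute check we mean is quoted with a source link just below.
> >
> > // sketch.cpp - simplified paraphrase of the sync decision, not Ceph code
> > #include <cstdint>
> > #include <iostream>
> >
> > struct SnapStat {      // stand-in for the few ceph_statx fields involved
> >     uint64_t size;
> >     int64_t  mtime;    // seconds only, nanoseconds omitted for brevity
> >     int64_t  ctime;
> > };
> >
> > // cur  = entry in the snapshot currently being synchronized (cstx)
> > // prev = the reference it is compared against (pstx) -- our open
> > //        question is whether this is the peer path or the last synced
> > //        snapshot on the mirror peer
> > void decide_sync(const SnapStat &cur, const SnapStat &prev,
> >                  bool *need_data_sync, bool *need_attr_sync) {
> >     // our reading: data is resynced when size or mtime differ (assumption)
> >     *need_data_sync = (cur.size != prev.size) || (cur.mtime != prev.mtime);
> >     // the check we are asking about: attrs resynced on any ctime difference
> >     *need_attr_sync = (cur.ctime != prev.ctime);
> > }
> >
> > int main() {
> >     // Loosely modeled on the log lines above: identical size, identical
> >     // (hypothetical) mtime, but ctime differs between source and mirror.
> >     SnapStat cur  = {154701172, 1643374461, 1643374461};  // 2022-01-28
> >     SnapStat prev = {154701172, 1643374461, 1661166198};  // 2022-08-22
> >     bool data = false, attr = false;
> >     decide_sync(cur, prev, &data, &attr);
> >     std::cout << "need_data_sync=" << data
> >               << ", need_attr_sync=" << attr << std::endl;
> >     return 0;
> > }
> >
> > With identical size and mtime but differing ctimes this prints
> > "need_data_sync=0, need_attr_sync=1", which is exactly the pattern we
> > see in the debug logs.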
> >
> > Based on these different ctimes, should_sync_entry then decides that
> > attributes need to be synced:
> > *need_attr_sync = (cstx.stx_ctime != pstx.stx_ctime)
> > https://github.com/ceph/ceph/blob/6fee777d603aebce492c57b41f3b5760d50ddb07/src/tools/cephfs_mirror/PeerReplayer.cc#L911
> >
> > We assume cur statx/cstx refers to the file in the snapshot currently
> > being mirrored.
> > But what exactly is prev statx/pstx? Is it the peer path or the last
> > snapshot on the mirror peer?
> >
> > We can confirm that the ctimes differ between the main cluster and the
> > mirror.
> > On the main cluster, the ctimes are consistent in every snapshot (since
> > the files didn't change).
> > On the mirror, the ctimes increase with every snapshot towards more
> > current dates.
> >
> > Given that the cephfs-mirror daemon writes the data to the mirror as a
> > CephFS client, it seems to make sense that data on the mirror has
> > different / more recent ctimes (from being written).
> > Also, when the mirror daemon syncs the attributes to the mirror,
> > wouldn't this trigger a new/current ctime as well?
> > So our assumption is that syncing an old ctime will actually result in
> > a new ctime, and thus trigger the sync of attributes over and over (at
> > least with every snapshot synced).
> >
> > So is ctime the proper parameter to test whether attributes need to be
> > synced? Or shouldn't it rather be excluded?
> > In other words, is this check the right thing to do: *need_attr_sync =
> > (cstx.stx_ctime != pstx.stx_ctime)?
> >
> > Is it reasonable to assume that these attribute syncs are responsible
> > for our slow mirroring?
> > Or is there anything else we should look out for?
> >
> > And are there actually commands or logs showing us the speed of the
> > mirroring?
> > We only know about sync_duration and sync_time_stamp (as in the status
> > above).
> > But then, how can we actually determine the size of a snapshot or the
> > difference between snapshots, so one can make speed calculations for
> > the latest sync?
> >
> > What is your general experience with mirroring performance?
> > In particular with bigger CephFS directories approaching petabyte scale.
> >
> > Mirroring (backing up) our data is a really crucial issue for us (and
> > certainly for many others).
> > So we are looking forward to your input. Thanks a lot in advance.
>
> I see you created https://tracker.ceph.com/issues/58058. I'll follow
> up with you on the tracker. Thanks!
> >
> > Best Wishes,
> > Mathias
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
> --
> Cheers,
> Venky

--
Cheers,
Venky
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx