Re: CephFS Snapshot Mirroring slow due to repeating attribute sync

On Tue, Aug 23, 2022 at 10:01 PM Kuhring, Mathias
<mathias.kuhring@xxxxxxxxxxxxxx> wrote:
>
> Dear Ceph developers and users,
>
> We are using ceph version 17.2.1
> (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable).
> We have been using cephadm since version 15 (Octopus).
>
> We mirror several CephFS directories from our main cluster over to a
> second mirror cluster.
> In particular with bigger directories (over 900 TB and 186 million files),
> we noticed that mirroring is very slow.
> On the mirror, most of the time we only observe a write speed of 0 to 10
> MB/s in the client IO.
> The target peer directory often doesn't show any increase in size during
> synchronization
> (when we check with: getfattr -n ceph.dir.rbytes).
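>
> For anyone who wants to reproduce that check, this is roughly how we
> watch it -- a minimal sketch, assuming the peer file system is mounted
> locally at the hypothetical path /mnt/mirror/irods/sodar and exposes the
> ceph.dir.rbytes xattr:
>
> import os, time
>
> PEER_DIR = "/mnt/mirror/irods/sodar"  # hypothetical mount point
> prev = None
> while True:
>     # ceph.dir.rbytes is the recursive byte count of the directory tree
>     rbytes = int(os.getxattr(PEER_DIR, "ceph.dir.rbytes"))
>     if prev is not None:
>         print(f"rbytes={rbytes}, delta={rbytes - prev} bytes in 60 s")
>     prev = rbytes
>     time.sleep(60)
>
> Most of the time the delta stays at or near zero even while a sync is
> reported as running.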
>
> The status of the syncs is always fine, i.e. syncing and not failing:
>
> 0|0[root@osd-1 /var/run/ceph/55633ec3-6c0c-4a02-990c-0f87e0f7a01f]# ceph
> --admin-daemon
> ceph-client.cephfs-mirror.osd-1.ydsqsw.7.94552861013544.asok fs mirror
> peer status cephfs@1 c66afb80-593f-4c42-a120-dd3b6fca26bc
> {
>      "/irods/sodar": {
>          "state": "syncing",
>          "current_sycning_snap": {
>              "id": 7552,
>              "name": "scheduled-2022-08-22-13_00_00"
>          },
>          "last_synced_snap": {
>              "id": 7548,
>              "name": "scheduled-2022-08-22-12_00_00",
>              "sync_duration": 37828.164744490001,
>              "sync_time_stamp": "13240678.542916s"
>          },
>          "snaps_synced": 1,
>          "snaps_deleted": 11,
>          "snaps_renamed": 0
>      }
> }
>
> The cluster nodes (6 per cluster) are connected to the switches with
> dual 40G NICs.
> The connection between the switches is 2x 100G.
> Simple write operations from other clients to the mirror CephFS reach
> e.g. 300 to 400 MB/s.
> So the network doesn't seem to be the issue here.
>
> We started to dig into the debug logs of the cephfs-mirror daemon / docker
> container.
> We set the debug level to 20, since otherwise there are no messages at all
> (so no errors).
>
> We observed a lot of messages with "need_data_sync=0, need_attr_sync=1",
> leading us to the assumption that, instead of actual data, mostly
> attributes are being synced.
>
> We started looking at specific examples in the logs and tried to make
> sense from the source code of which steps are happening.
> Most of the messages are coming from cephfs::mirror::PeerReplayer:
> https://github.com/ceph/ceph/blob/6fee777d603aebce492c57b41f3b5760d50ddb07/src/tools/cephfs_mirror/PeerReplayer.cc
>
> We figured that the do_synchronize function checks whether data
> (need_data_sync) or attributes (need_attr_sync) should be synchronized,
> using should_sync_entry,
> and if necessary performs the sync using remote_file_op.
>
> should_sync_entry reports different ctimes for our examples, e.g.:
> local cur statx: mode=33152, uid=996, gid=993, size=154701172,
> ctime=2022-01-28T12:54:21.176004+0000, ...
> local prev statx: mode=33152, uid=996, gid=993, size=154701172,
> ctime=2022-08-22T11:03:18.578380+0000, ...
>
> Based on these different ctimes, should_sync_entry then decides that
> attributes need to be synced:
> *need_attr_sync = (cstx.stx_ctime != pstx.stx_ctime)
> https://github.com/ceph/ceph/blob/6fee777d603aebce492c57b41f3b5760d50ddb07/src/tools/cephfs_mirror/PeerReplayer.cc#L911
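>
> To make our reading explicit, this is how we paraphrase that check for
> ourselves in Python. It is only our interpretation, not the actual
> implementation; needs_attr_sync is our own made-up helper, and the two
> stat calls stand in for whatever cstx and pstx really refer to:
>
> import os
>
> def needs_attr_sync(cur_path, prev_path):
>     # cur_path: the entry in the snapshot currently being mirrored
>     # prev_path: the "prev"/peer entry -- exactly what this is, is our
>     # question below
>     cstx = os.stat(cur_path)   # corresponds to cstx.stx_ctime in the C++
>     pstx = os.stat(prev_path)  # corresponds to pstx.stx_ctime in the C++
>     # any ctime difference forces an attribute sync of the entry
>     return cstx.st_ctime != pstx.st_ctime
>
> With this reading, the attribute sync is retriggered whenever the two
> ctimes differ for any reason.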
>
> We assume cur statx/cstx refers to the file in the snapshot currently
> mirrored.
> But what exactly is prev statx/pstx? Is it the peer path or the last
> snapshot on the mirror peer?
>
> We can confirm that ctimes are different on the main cluster and the mirror.
> On the main cluster, the ctimes are consistent in every snapshot (since
> the files didn't change).
> On the mirror, the ctimes increase with every snapshot towards more
> recent dates.
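>
> This is easy to check when both file systems are mounted locally. Roughly
> what we did, as a sketch with hypothetical mount points and a file that
> never changed between snapshots:
>
> import os
>
> FILE = "some/unchanged/file.dat"               # hypothetical relative path
> SNAPDIRS = ["/mnt/primary/irods/sodar/.snap",  # hypothetical mounts
>             "/mnt/mirror/irods/sodar/.snap"]
>
> for snapdir in SNAPDIRS:
>     for snap in sorted(os.listdir(snapdir)):
>         st = os.stat(os.path.join(snapdir, snap, FILE))
>         print(snapdir, snap, st.st_ctime)
>
> On the primary the printed ctime is identical in every snapshot; on the
> mirror it moves forward with every snapshot.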
>
> Given that the CephFS Mirror daemon writes the data to the mirror as a
> CephFS client,
> it seems to make sense that data on the mirror has different / more
> recent ctimes (from writing).
> Also, when the mirror daemon is syncing the attributes to the mirror,
> wouldn't this trigger a new/current ctime as well?
> So our assumption is that syncing an old ctime will actually result in a
> new ctime,
> and thus trigger the sync of attributes over and over (at least with
> every snapshot synced).
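>
> That a client cannot preserve ctime is ordinary POSIX behaviour: atime and
> mtime can be set explicitly, but ctime is bumped by the file system itself
> on every metadata change. A small demonstration on any local file system:
>
> import os, time, tempfile
>
> fd, path = tempfile.mkstemp()
> os.close(fd)
> before = os.stat(path).st_ctime_ns
> time.sleep(1)
> os.chmod(path, 0o600)       # an "attribute sync" style operation
> os.utime(path, ns=(0, 0))   # atime/mtime can be forced to old values...
> after = os.stat(path).st_ctime_ns
> print(before, after)        # ...but ctime has still moved forward
> os.remove(path)
>
> So if the daemon really compares the source ctime against a ctime it
> produced itself on the mirror, the two can never converge.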
>
> So is ctime the proper parameter to test whether attributes need to be
> synced, or should it rather be excluded?
> In other words, is this check the right thing to do: *need_attr_sync =
> (cstx.stx_ctime != pstx.stx_ctime)
>
> Is it reasonable to assume that these attribute syncs are responsible
> for our slow mirroring?
> Or is there anything else we should look out for?
>
> And are there actually commands or logs showing us the speed of the
> mirroring?
> We only know about sync_duration and sync_time_stamp (as in the status
> above).
> But then, how can we actually determine the size of a snapshot or the
> difference between snapshots,
> so that one can make speed calculations for the latest sync?
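>
> The best we came up with is a rough estimate from the recursive sizes of
> the snapshots and the reported sync_duration -- a sketch, assuming the
> source file system is mounted at a hypothetical path, that the snapshot
> directories expose the ceph.dir.rbytes xattr, and keeping in mind that the
> rbytes difference between two snapshots is only the change in total size,
> not the amount of data that actually had to be transferred:
>
> import os
>
> SNAPDIR = "/mnt/primary/irods/sodar/.snap"   # hypothetical mount
> PREV = "scheduled-2022-08-22-12_00_00"
> CUR = "scheduled-2022-08-22-13_00_00"
> SYNC_DURATION = 37828.16                     # seconds, from the peer status
>
> def rbytes(path):
>     # recursive byte count of the directory tree, as a plain integer
>     return int(os.getxattr(path, "ceph.dir.rbytes"))
>
> delta = rbytes(os.path.join(SNAPDIR, CUR)) - rbytes(os.path.join(SNAPDIR, PREV))
> print(delta, "bytes size change,", delta / SYNC_DURATION, "bytes/s")
>
> But that still doesn't tell us how much data the mirror daemon really
> copied, so a proper metric from the daemon itself would be very welcome.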
>
> What is your general experience with mirroring performance?
> In particular with bigger CephFS directories towards petabytes.
>
> Mirroring (backing up) our data is a really crucial issue for us (and
> certainly many others).
> So we are looking forward to your input. Thanks a lot in advance.

I see you created https://tracker.ceph.com/issues/58058. I'll follow
up with you on the tracker. Thanks!

>
> Best Wishes,
> Mathias
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx



-- 
Cheers,
Venky

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


