Re: MON sync time depends on outage duration

Hi Eugen!

Yes, that sounds familiar from the Luminous and Mimic days.

Check this old thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/F3W2HXMYNF52E7LPIQEJFUTAD3I7QE25/
(that thread is truncated but I can tell you that it worked for Frank).
Also the even older referenced thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/M5ZKF7PTEO2OGDDY5L74EV4QS5SDCZTH/

The workaround for zillions of snapshot keys at that time was to use:
   ceph config set mon mon_sync_max_payload_size 4096

That said, that sync issue was supposed to be fixed by the new option
mon_sync_max_payload_keys, which has been around since Nautilus.
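
To see what both options are currently set to, something like the
following should work. This is only a sketch: the values printed depend
on your release and on any overrides already in the config database.

```shell
# Values currently in effect for the mons.
ceph config get mon mon_sync_max_payload_size
ceph config get mon mon_sync_max_payload_keys

# Description and built-in default of the newer option.
ceph config help mon_sync_max_payload_keys
```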

So in your case it could be that the sync payload is just too small to
move 42 million osd_snap keys efficiently. With debug_paxos and debug_mon
raised you should be able to see what is taking so long, and then tune
mon_sync_max_payload_size and mon_sync_max_payload_keys accordingly.
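
As a rough back-of-the-envelope sketch (not from the old threads): if each
sync chunk is capped by mon_sync_max_payload_keys, the number of chunks is
at least the key count divided by that cap. The helper names below are made
up for illustration. The 2000 keys per chunk is an assumed value, so check
the real one with ceph config get. Debug level 10 is only a common choice.
mon_sync_max_payload_size can cut chunks shorter still, so treat the result
as a lower bound.

```shell
#!/bin/sh
# Hypothetical helper: lower bound on the number of sync chunks,
# assuming every chunk carries the maximum number of keys.
sync_chunks() {
    keys=$1
    keys_per_chunk=$2
    # Ceiling division in POSIX shell arithmetic.
    echo $(( (keys + keys_per_chunk - 1) / keys_per_chunk ))
}

# Hypothetical helpers: raise and reset mon debug logging around a test sync.
# They are only defined here; call them on a node with admin access.
enable_sync_debug() {
    ceph config set mon debug_mon 10
    ceph config set mon debug_paxos 10
}

reset_sync_debug() {
    ceph config rm mon debug_mon
    ceph config rm mon debug_paxos
}

# Example: 42 million keys, assuming 2000 keys per chunk.
sync_chunks 42000000 2000   # prints 21000
```

With that many round trips, even a small per-chunk delay on an HDD-backed
store adds up, which is what the debug logs should confirm or rule out.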

Good luck!

Dan

______________________________________________________
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com



On Thu, Jul 6, 2023 at 1:47 PM Eugen Block <eblock@xxxxxx> wrote:

> Hi *,
>
> I'm investigating an interesting issue on two customer clusters (used
> for mirroring) I've not solved yet, but today we finally made some
> progress. Maybe someone has an idea where to look next, I'd appreciate
> any hints or comments.
> These are two (latest) Octopus clusters, main usage currently is RBD
> mirroring with snapshot mode (around 500 RBD images are synced every
> 30 minutes). They noticed very long startup times of MON daemons after
> reboot, times between 10 and 30 minutes (reboot time already
> subtracted). These delays are present on both sites. Today we got a
> maintenance window and started to check in more detail by just
> restarting the MON service (joins quorum within seconds), then
> stopping the MON service and waiting a few minutes (still joins quorum
> within seconds). And then we stopped the service and waited for more
> than 5 minutes, simulating a reboot, and then we were able to
> reproduce it. The sync then takes around 15 minutes, we verified with
> other MONs as well. The MON store is around 2 GB in size (on HDD), I
> understand that the sync itself can take some time, but what is the
> threshold here? I tried to find a hint in the MON config, searching
> for timeouts with 300 seconds, there were only a few matches
> (mon_session_timeout is one of them), but I'm not sure if they can
> explain this behavior.
> Investigating the MON store (ceph-monstore-tool dump-keys) I noticed
> that there were more than 42 million osd_snap keys, which is quite a
> lot and would explain the size of the MON store. But I'm also not sure
> if it's related to the long syncing process.
> Does that sound familiar to anyone?
>
> Thanks,
> Eugen
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>