Hi Eugen!

Yes, that sounds familiar from the Luminous and Mimic days.

Check this old thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/F3W2HXMYNF52E7LPIQEJFUTAD3I7QE25/
(that thread is truncated, but I can tell you that it worked for Frank).

Also the even older referenced thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/M5ZKF7PTEO2OGDDY5L74EV4QS5SDCZTH/

The workaround for zillions of snapshot keys at that time was to use:

    ceph config set mon mon_sync_max_payload_size 4096

That said, that sync issue was supposed to be fixed by adding the new
option mon_sync_max_payload_keys, which has been around since Nautilus.

So it could be that in your case the sync payload is just too small to
efficiently move 42 million osd_snap keys? Using debug_paxos and
debug_mon you should be able to understand what is taking so long, and
tune mon_sync_max_payload_size and mon_sync_max_payload_keys
accordingly.

Good luck!

Dan

______________________________________________________
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com

On Thu, Jul 6, 2023 at 1:47 PM Eugen Block <eblock@xxxxxx> wrote:
> Hi *,
>
> I'm investigating an interesting issue on two customer clusters (used
> for mirroring) that I have not solved yet, but today we finally made
> some progress. Maybe someone has an idea where to look next, I'd
> appreciate any hints or comments.
> These are two (latest) Octopus clusters, main usage currently is RBD
> mirroring with snapshot mode (around 500 RBD images are synced every
> 30 minutes). They noticed very long startup times of MON daemons after
> reboot, times between 10 and 30 minutes (reboot time already
> subtracted). These delays are present on both sites. Today we got a
> maintenance window and started to check in more detail by just
> restarting the MON service (joins quorum within seconds), then
> stopping the MON service and waiting a few minutes (still joins quorum
> within seconds).
> And then we stopped the service and waited for more
> than 5 minutes, simulating a reboot, and then we were able to
> reproduce it. The sync then takes around 15 minutes; we verified this
> with other MONs as well. The MON store is around 2 GB in size (on
> HDD). I understand that the sync itself can take some time, but what
> is the threshold here? I tried to find a hint in the MON config,
> searching for timeouts with 300 seconds; there were only a few matches
> (mon_session_timeout is one of them), but I'm not sure if they can
> explain this behavior.
> Investigating the MON store (ceph-monstore-tool dump-keys) I noticed
> that there were more than 42 million osd_snap keys, which is quite a
> lot and would explain the size of the MON store. But I'm also not sure
> if it's related to the long syncing process.
> Does that sound familiar to anyone?
>
> Thanks,
> Eugen
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
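
To put rough numbers on the payload tuning discussed in the reply, here is a back-of-the-envelope sketch in Python. It is only a model, not how the monitor actually batches keys: it assumes a sync chunk is closed as soon as either mon_sync_max_payload_keys or mon_sync_max_payload_size is reached, that all keys are the same size, and that the whole 2 GB store consists of the 42 million osd_snap keys. The defaults used below (2000 keys, 1 MiB) are my assumption and should be checked with "ceph config get mon ..." on the cluster. The function name is made up for illustration.

```python
import math


def estimate_sync_chunks(total_keys, store_bytes, max_payload_keys, max_payload_bytes):
    """Rough chunk-count estimate for a full MON store sync.

    Assumes uniformly sized keys and that a chunk is closed when either
    the key limit or the byte limit is hit (a simplification).
    Returns (number_of_chunks, keys_per_chunk).
    """
    avg_bytes_per_key = store_bytes / total_keys
    # How many average-sized keys fit under the byte limit (at least one).
    keys_by_size = max(1, int(max_payload_bytes // avg_bytes_per_key))
    # Whichever limit is hit first decides the chunk size.
    keys_per_chunk = min(max_payload_keys, keys_by_size)
    return math.ceil(total_keys / keys_per_chunk), keys_per_chunk


TOTAL_KEYS = 42_000_000       # osd_snap keys reported in the thread
STORE_BYTES = 2 * 1000**3     # "around 2 GB" MON store

# Assumed defaults: 2000 keys per chunk, 1 MiB per chunk.
chunks, per_chunk = estimate_sync_chunks(TOTAL_KEYS, STORE_BYTES, 2000, 1024 * 1024)
print(f"assumed defaults: {chunks} chunks of {per_chunk} keys")

# The old workaround value of 4096 bytes, for comparison.
chunks_4k, per_chunk_4k = estimate_sync_chunks(TOTAL_KEYS, STORE_BYTES, 2000, 4096)
print(f"payload_size=4096: {chunks_4k} chunks of {per_chunk_4k} keys")
```

Under these assumptions the key limit, not the byte limit, is what caps each chunk with the assumed defaults, and a 15 minute sync would work out to roughly 43 ms per chunk. If the debug_mon output shows a similar per-chunk pace, raising mon_sync_max_payload_keys would be the first thing to try.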