Hi Sage,

Thank you for your help. My original issue with slow ops on OSD restarts is gone too, even with default values for paxos_propose_interval. It's a bit annoying that I spent many hours debugging this and in the end had only missed one step in the upgrade. Only during the update itself, until require_osd_release is set to the new version, will there be interruptions.

Regards
Manuel

________________________________
From: Sage Weil <sage@xxxxxxxxxxxx>
Sent: Tuesday, 9 November 2021 17:29
To: Manuel Lausch
Subject: Re: Re: OSD spend too much time on "waiting for readable" -> slow ops -> laggy pg -> rgw stop -> worst case osd restart

Yeah, I think that is the problem. The field that is getting updated by prepare_beacon is new in octopus, so if your osdmap still has require_osd_release=nautilus then it is trying to set it but it is not getting encoded (for compatibility). Doing `ceph osd require_osd_release octopus` should resolve this.

On Tue, Nov 9, 2021 at 9:01 AM Sage Weil <sage@xxxxxxxxxxxx> wrote:

What version are you running? I thought it was pacific or octopus, but the osdmap says "require_osd_release": "nautilus", which implies the upgrade procedure wasn't finished?

sage

On Tue, Nov 9, 2021 at 8:08 AM Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:

As far as I can see, the maps differ only in the epoch and creation date, nothing else. I dumped some maps and uploaded them for you: 1f1e1e5e-1c1c-470b-b691-ed820687bab8

On this cluster I don't create snapshots regularly. For some weeks now, no snapshots have been present.

Please let me know if you need further information.

Regards
Manuel

On Tue, 9 Nov 2021 07:40:29 -0600
Sage Weil <sage@xxxxxxxxxxxx> wrote:

> Are you sure consecutive maps are identical? Can you get the latest
> epoch ('ceph osd stat'), and then dump a few consecutive ones? e.g.
>
> ceph osd dump 1000 -f json-pretty > 1000
> ceph osd dump 1001 -f json-pretty > 1001
> ceph osd dump 1002 -f json-pretty > 1002
> ceph osd dump 1003 -f json-pretty > 1003
>
> ...and ceph-post-file those? Based on the logs I think the delta is
> related to snap trimming, but want to confirm.
>
> Thanks!
> sage

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
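For anyone landing on this thread later, the check-and-fix discussed above can be sketched as the following commands (assumptions: a running cluster with an admin keyring, and "octopus" as the target release of this particular upgrade; substitute your own):

```shell
# Show which release the osdmap currently requires. After a finished
# upgrade this should print the new release name, not the old one.
ceph osd dump | grep require_osd_release

# If it still reports the previous release (here: "nautilus"),
# finalize the upgrade so that new osdmap fields actually get encoded:
ceph osd require_osd_release octopus
```

Until `require_osd_release` is bumped, new fields (such as the one updated by prepare_beacon) are silently dropped from the osdmap for compatibility, which is what caused the "waiting for readable" slow ops described in this thread.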