Re: OSDs spend too much time on "waiting for readable" -> slow ops -> laggy PGs -> RGW stops -> worst case OSD restarts

Yeah, I think two different things are going on here.

The read leases were new, and I think the way that OSDs are marked down is
the key thing that affects that behavior. I'm a bit surprised that the
_notify_mon option helps there, and I will take a closer look on Monday to
make sure it's doing what it's supposed to be doing.
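
For anyone who wants to poke at this in the meantime, something along these
lines should show and flip that option at runtime. This is a sketch only,
and the full option name is an assumption on my part, since only
"_notify_mon" appears above:

  # full option name assumed; only "_notify_mon" is quoted in this thread
  ceph config get osd osd_fast_shutdown_notify_mon
  ceph config set osd osd_fast_shutdown_notify_mon true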

The paxos_propose_interval is an upper bound on how long the monitor is
allowed to batch updates before committing them.  Many/most changes are
committed immediately, but the osdmap management tries to batch things up
so that a single osdmap epoch combines lots of changes when they are
happening quickly (there tend to be mini storms of updates when cluster
changes happen).  The default of 2s might be too long for many
environments, though... and we might consider changing the default to
something smaller (maybe more like 250ms).
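
If you want to experiment with a smaller window in the meantime, it can be
set at runtime through the config database; the value is in seconds, so a
minimal sketch would be:

  # shrink the mon update-batching window from the 2s default to 250ms
  ceph config set mon paxos_propose_interval 0.25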

sage

On Fri, Nov 5, 2021 at 8:40 AM Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:

> Maybe that was me in an earlier mail.
> It started at the point where all replica partners were on Octopus.
>
> This makes sense if I look at this code snippet:
>
>   if (!HAVE_FEATURE(recovery_state.get_min_upacting_features(),
>                     SERVER_OCTOPUS)) {
>     dout(20) << __func__ << " not all upacting has SERVER_OCTOPUS" << dendl;
>     return true;
>   }
>
> ->
> https://github.com/ceph/ceph/blob/v15.2.12/src/osd/PrimaryLogPG.cc#L772-L775
>
>
> On Fri, 5 Nov 2021 14:20:00 +0100
> Peter Lieven <pl@xxxxxxx> wrote:
> >
> > I remember that someone wrote earlier that the issues while upgrading
> > from Nautilus to Octopus started at the point where the osd compat
> > level is set to octopus.
> >
> > So one of my initial guesses back when I tried to analyze this issue
> > was that it has something to do with the new "read from all osds not
> > just the primary" feature.
> >
> > Does that make sense?
> >
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


