Re: osd fast shutdown provokes slow requests

Dan van der Ster <dan@xxxxxxxxxxxxxx> · Fri, 14 Aug 2020 14:02:30 +0200

Hi,

I suppose the idea is that it's quicker to fail via the connection
refused setting than by waiting for an osdmap to be propagated across
the cluster.
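
As a toy illustration of why a refused connection gets noticed quickly
(plain Python sockets on localhost, nothing Ceph-specific; the helper
name is made up for this example):

```python
import socket


def probe_closed_port(timeout=5.0):
    """Connect to a local port that no longer has a listener.

    Returns the name of the exception raised by connect(), or
    "connected" if the attempt unexpectedly succeeds.
    """
    # Bind to a free port, then close the listener again. This is roughly
    # what a peer sees once the process on the other end has exited.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    listener.close()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(("127.0.0.1", port))
    except OSError as exc:
        # On Linux this is normally ConnectionRefusedError, raised
        # without waiting for the timeout to expire.
        return type(exc).__name__
    finally:
        client.close()
    return "connected"
```

The error comes straight back from the kernel, so the peer does not
have to wait for any timeout or map update to learn the process is gone.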

It looks simple enough in OSD.cc to also send the down message to the
mon even with fast shutdown enabled. But I don't have any clue if that
would cause other issues.
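
Roughly what I mean, as a toy sketch only. This is not the real OSD.cc
logic; the function, the `notify_mon_on_fast` flag, and the step labels
are all hypothetical:

```python
def shutdown_steps(fast_shutdown, notify_mon_on_fast=False):
    """Return the ordered steps a toy OSD takes when asked to stop.

    The step names are illustrative labels, not Ceph functions.
    """
    steps = []
    if fast_shutdown:
        if notify_mon_on_fast:
            # The change discussed here: still tell the mon before exiting.
            steps.append("send_down_message_to_mon")
        # Fast path: exit right away; peers notice via refused connections.
        steps.append("exit_immediately")
    else:
        # Normal path: announce the shutdown, then tear down cleanly.
        steps.append("send_down_message_to_mon")
        steps.append("clean_teardown")
        steps.append("exit")
    return steps
```

With `shutdown_steps(True, notify_mon_on_fast=True)` the mon gets the
down message and the process still exits without the full teardown.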

-- Dan

On Fri, Aug 14, 2020 at 1:51 PM Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:
>
> Hi Dan,
>
> thank you for the link. I read it as well as the linked conversation in
> the rook project.
>
> I don't understand why the fast shutdown should be better than the "normal"
> shutdown, in which the OSD announces its shutdown directly.
>
> Are there cases where the OSD's shutdown takes a long time before its
> mark-down message is sent?
> Travis on the rook project mentioned that the shutdown caused IO
> interruptions of 20 to 30 seconds.
> (https://github.com/rook/rook/pull/4328#issuecomment-554480275)
> This is something I would expect if a complete host breaks down (e.g.
> because of a hardware failure) and only the heartbeat timeouts detect it,
> but not on a regular shutdown.
>
> Manuel
>
> On Fri, 14 Aug 2020 11:03:37 +0200
> Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> > There's a bit of discussion on this at the original PR:
> > https://github.com/ceph/ceph/pull/31677
> > Sage claims the IO interruption should be smaller with
> > osd_fast_shutdown than without.
> >
> > -- dan
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx