Re: OSD spend too much time on "waiting for readable" -> slow ops -> laggy pg -> rgw stop -> worst case osd restart


Nice. I am just now building a 16.2.6 release with this patch and will
test it.

Thanks,
Manuel


On Thu, 18 Nov 2021 15:02:38 -0600
Sage Weil <sage@xxxxxxxxxxxx> wrote:

> Okay, good news: on the osd start side, I identified the bug (and easily
> reproduced locally).  The tracker and fix are:
> 
>  https://tracker.ceph.com/issues/53326
>  https://github.com/ceph/ceph/pull/44015
> 
> These will take a while to work through QA and get backported.
> 
> Also, to reiterate what I said on the call earlier today about the osd
> stopping issues:
>  - A key piece of the original problem you were seeing was because
> require_osd_release wasn't up to date, which meant that the dead_epoch
> metadata wasn't encoded in the OSDMap and we would basically *always* go
> into the read lease wait when an OSD stopped.
>  - Now that that is fixed, it appears as though setting both
> osd_fast_shutdown *and* osd_fast_shutdown_notify_mon is the winning
> combination.
> 
> I would be curious to hear if adjusting the icmp throttle kernel setting
> makes things behave better when osd_fast_shutdown_notify_mon=false (the
> default), but this is more out of curiosity--I think we've concluded that
> we should set this option to true by default.
> 
> If I'm missing anything, please let me know!
> 
> Thanks for your patience in tracking this down.  It's always a bit tricky
> when there are multiple contributing factors (in this case, at least 3).
> 
> sage
> 
> 
> 
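
For reference, the settings discussed in the quoted message could be applied roughly as follows. This is only a sketch, not something taken from the thread: the `run` helper, the `APPLY` and `RELEASE` variables, and the `apply_shutdown_settings` function name are made up for illustration, and `pacific` is an assumption based on the 16.2.6 build mentioned above. It prints the commands by default and only executes them when `APPLY=1` is set.

```sh
#!/bin/sh
# Dry-run by default: print each ceph command; set APPLY=1 to execute it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Assumption: the cluster runs Pacific (16.2.x). Use the release that all
# of your OSDs are actually running.
RELEASE="${RELEASE:-pacific}"

apply_shutdown_settings() {
    # Bring require_osd_release up to date; per the thread, a stale value
    # kept the dead_epoch metadata out of the OSDMap.
    run ceph osd require-osd-release "$RELEASE"
    # The combination reported as working in the thread.
    run ceph config set osd osd_fast_shutdown true
    run ceph config set osd osd_fast_shutdown_notify_mon true
}

apply_shutdown_settings
```

Afterwards, `ceph osd dump | grep require_osd_release` and `ceph config get osd osd_fast_shutdown_notify_mon` should show the values currently in effect.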

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


