On Nov 2, 2021 at 15:02, Sage Weil wrote:
On Tue, Nov 2, 2021 at 8:29 AM Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:
Hi Sage,
"osd_fast_shutdown" is set to "false".
When we upgraded to Luminous I also had blocked IO issues with this
enabled.
Some weeks ago I tried out the options "osd_fast_shutdown" and
"osd_fast_shutdown_notify_mon" and also got slow ops while
stopping/starting OSDs. But I didn't check whether this triggered the
problem with the read_leases or my old issue with the fast
shutdown.
Just to be clear, you should try
osd_fast_shutdown = true
osd_fast_shutdown_notify_mon = false
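For reference, a minimal sketch of applying and checking these two values at runtime. It assumes a release with the centralized config database (Mimic or later) and that the options are set for all OSDs rather than per daemon; adjust the target if you only want to test on a few OSDs.

```sh
# Assumption: options are applied cluster-wide to the "osd" section.
ceph config set osd osd_fast_shutdown true
ceph config set osd osd_fast_shutdown_notify_mon false

# Read the values back to confirm what the OSDs will pick up.
ceph config get osd osd_fast_shutdown
ceph config get osd osd_fast_shutdown_notify_mon
```

The same two lines can instead go under [osd] in ceph.conf if the cluster is still managed that way.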
You wrote that if the OSD rejects messenger connections because it is
stopped, the peering process will skip the read_lease timeout. If the
OSD announces its shutdown, can we not skip this read_lease timeout as
well?
If memory serves, yes, but the notify_mon process can take more time than a
peer OSD getting ECONNREFUSED. The combination above is the recommended
one (and the default).
When we faced this issue we had a fresh Octopus install with default values...
If necessary I can upgrade our development cluster to Octopus again and also
run some tests.
Best,
Peter