Okay, I poked around a bit more and found this document:
https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/

I don't understand exactly what it is all about, how it works, and what
the intention behind it is. But there is one config option mentioned:
"osd_pool_default_read_lease_ratio". It defaults to 0.8. Multiplied with
osd_heartbeat_grace (default 20), it sets that "read lease" to 16
seconds?!

I set this ratio to 0.2, which leads to a 4 second lease time. With
that, the problem is solved. No more slow ops.

Until now I thought this was a problem only on huge clusters, but given
this setting I assumed it should be an issue on quite small clusters as
well. So I tested it on a 3-node, 12-OSD SSD cluster on Octopus and hit
the same issues.

I can't believe I am the first one to have this problem.

Manuel


On Thu, 10 Jun 2021 17:45:02 +0200
Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:

> Hi Peter,
>
> your suggestion pointed me to the right spot.
> I didn't know about the feature that Ceph will read from replica
> PGs.
>
> So far I found two functions in osd/PrimaryLogPG.cc:
> "check_laggy" and "check_laggy_requeue". Both first check whether
> the peers have the Octopus features; if not, the function is
> skipped. This explains why the problem began after about half of
> the cluster was updated.
>
> To verify this, I added "return true" as the first line of both
> functions. The issue is gone with it, but I don't know what
> problems this could trigger. I know the root cause is not fixed
> by it.
> I think I will open a bug ticket with this knowledge.
>
> osd_op_queue_cutoff is set to high
> and ICMP rate limiting should not happen
>
> Thanks
> Manuel
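
PS: to make the arithmetic explicit, this is my reading of how the lease
interval is derived, and roughly how I changed the ratio for the test.
The config level ("global") and whether it takes effect without an OSD
restart are assumptions on my side, so treat this as a sketch rather
than a verified recipe:

    # read lease interval = osd_heartbeat_grace * osd_pool_default_read_lease_ratio
    #   defaults:     20 s * 0.8 = 16 s
    #   my test run:  20 s * 0.2 =  4 s
    ceph config set global osd_pool_default_read_lease_ratio 0.2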