On 2/11/21 1:39 PM, Davor Cubranic wrote:
> But the config reference says “high” is already the default value?
> (https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/)
>

It is not the default in Nautilus. See
https://docs.ceph.com/en/nautilus/rados/configuration/osd-config-ref/?#operations

osd op queue cut off

    Description
        This selects which priority ops will be sent to the strict queue
        versus the normal queue. The low setting sends all replication ops
        and higher to the strict queue, while the high option sends only
        replication acknowledgement ops and higher to the strict queue.
        Setting this to high should help when a few OSDs in the cluster
        are very busy, especially when combined with wpq in the osd op
        queue setting. OSDs that are very busy handling replication
        traffic could starve primary client traffic on these OSDs without
        these settings. Requires a restart.

    Type
        String

    Valid Choices
        low, high

    Default
        low


>> On Feb 9, 2021, at 4:42 AM, Milan Kupcevic <milan_kupcevic@xxxxxxxxxxx
>> <mailto:milan_kupcevic@xxxxxxxxxxx>> wrote:
>>
>> On 2/9/21 7:29 AM, Michal Strnad wrote:
>>>
>>> We are looking for a proper solution for slow_ops. When a disk fails
>>> and the node is restarted, a lot of slow operations appear. Even when
>>> the disk (OSD) or node is back again, most of the slow_ops are still
>>> there. On the internet we found only the advice to restart the
>>> monitor, but that is not the right approach. Do you have a better
>>> solution? How do you treat slow_ops in your production clusters?
>>>
>>> We are running the latest Nautilus on all clusters.
>>>
>>
>> This config setting should help:
>>
>>     ceph config set osd osd_op_queue_cut_off high
>>

--
Milan Kupcevic
Senior Cyberinfrastructure Engineer at Project NESE
Harvard University
FAS Research Computing
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
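
For reference, a minimal sketch of applying and verifying the setting on a
Nautilus cluster, assuming a systemd-based deployment (the ceph-osd.target
restart step and the osd.0 daemon name are illustrative assumptions; adapt
them to your installation):

    # show the value currently stored in the monitor config database
    ceph config get osd osd_op_queue_cut_off

    # switch the cut-off to the "high" behaviour described above
    ceph config set osd osd_op_queue_cut_off high

    # check what a running OSD actually uses (example daemon: osd.0)
    ceph config show osd.0 osd_op_queue_cut_off

    # the option only takes effect after an OSD restart, e.g. on each OSD host:
    systemctl restart ceph-osd.target

The restart is the part people tend to miss: until the OSDs are restarted,
ceph config show will still report the old value even though the monitor
database already holds the new one.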