Hello Istvan,

the state "waiting for readable" seems to be related to the read_lease topic documented here: https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/

The only parameter I know of to tune around this is "osd_pool_default_read_lease_ratio", which defaults to 0.8. My clusters complain about slow ops after 5 seconds because of a corresponding configuration setting. On starting, stopping, or restarting OSDs I see slow ops of up to about 15 seconds. On clusters with default settings this is not reported as a slow op, because it stays under the default 32-second limit. We plan to set this parameter to 0.15 to mitigate the issue before we upgrade to Ceph Octopus and later. I would also be happy if someone with deeper knowledge could take a look at this issue.

Regarding your cluster: I don't see what triggers it to get into this state. Did you stop/start some OSDs beforehand? Perhaps you can share some more details.

Manuel

On Thu, 28 Oct 2021 11:19:15 +0000
"Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx> wrote:

> Hi,
>
> I found a couple of emails in the mailing-list archive that might be
> related to this issue, but there it came up after an OSD restart or
> recovery. My cluster is on Octopus 15.2.14 at the moment.
>
> When slow ops start to come in, I can see laggy PGs, and slowly the 3
> RADOS gateways behind the haproxy start to die; when the slow ops are
> gone they restart, but this causes user interruption.
>
> Digging deeper into the affected OSD's historical slow-ops log, I can
> see a couple of slow ops where 32 seconds were spent in the "waiting
> for readable" event. Is there anything that can be done about this?
> To avoid OSD restarts I'm currently running with
> osd_op_thread_suicide_timeout=2000 and osd_op_thread_timeout=90 on
> all OSDs.
>
> Could it be that I'm using too small a number of PGs?
> The data pool is on 4:2 EC, host-based, across 7 nodes; the PG number
> is currently 128, based on the balancer suggestion.
>
> An example slow op:
>
> {
>     "description": "osd_op(client.141400841.0:290235021 28.17s0 28:e94c28ab:::9213182a-14ba-48ad-bde9-289a1c0c0de8.6034919.1_%2fWHITELABEL-1%2fPAGETPYE-5%2fDEVICE-4%2fLANGUAGE-38%2fSUBTYPE-0%2f148341:head [create,setxattr user.rgw.idtag (56) in=14b,setxattr user.rgw.tail_tag (56) in=17b,writefull 0~17706,setxattr user.rgw.manifest (375) in=17b,setxattr user.rgw.acl (123) in=12b,setxattr user.rgw.content_type (10) in=21b,setxattr user.rgw.etag (32) in=13b,setxattr user.rgw.x-amz-meta-storagetimestamp (40) in=36b,call rgw.obj_store_pg_ver in=19b,setxattr user.rgw.source_zone (4) in=20b] snapc 0=[] ondisk+write+known_if_redirected e37602)",
>     "initiated_at": "2021-10-28T16:52:44.652426+0700",
>     "age": 3258.4773889160001,
>     "duration": 32.445993113,
>     "type_data": {
>         "flag_point": "started",
>         "client_info": {
>             "client": "client.141400841",
>             "client_addr": "10.118.199.3:0/462844935",
>             "tid": 290235021
>         },
>         "events": [
>             {
>                 "event": "initiated",
>                 "time": "2021-10-28T16:52:44.652426+0700",
>                 "duration": 0
>             },
>             {
>                 "event": "throttled",
>                 "time": "2021-10-28T16:52:44.652426+0700",
>                 "duration": 6.1143999999999996e-05
>             },
>             {
>                 "event": "header_read",
>                 "time": "2021-10-28T16:52:44.652487+0700",
>                 "duration": 2.2340000000000001e-06
>             },
>             {
>                 "event": "all_read",
>                 "time": "2021-10-28T16:52:44.652489+0700",
>                 "duration": 1.1519999999999999e-06
>             },
>             {
>                 "event": "dispatched",
>                 "time": "2021-10-28T16:52:44.652490+0700",
>                 "duration": 3.146e-06
>             },
>             {
>                 "event": "queued_for_pg",
>                 "time": "2021-10-28T16:52:44.652493+0700",
>                 "duration": 4.0936000000000002e-05
>             },
>             {
>                 "event": "reached_pg",
>                 "time": "2021-10-28T16:52:44.652534+0700",
>                 "duration": 1.9236e-05
>             },
>             {
>                 "event": "waiting for readable",
>                 "time": "2021-10-28T16:52:44.652554+0700",
>                 "duration": 32.430086821000003
>             },
>             {
>                 "event": "reached_pg",
>                 "time": "2021-10-28T16:53:17.082640+0700",
>                 "duration": 0.00010452500000000001
>             },
>             {
>                 "event": "started",
>                 "time": "2021-10-28T16:53:17.082745+0700",
>                 "duration": 0.015673919000000001
>             },
>             {
>                 "event": "done",
>                 "time": "2021-10-28T16:53:17.098419+0700",
>                 "duration": 32.445993113
>             }
>         ]
>     }
> }
>
> Thanks in advance.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
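P.S. A quick back-of-the-envelope check of the numbers above. This is a sketch, assuming (per the stale_read dev doc linked earlier) that the read lease length is osd_heartbeat_grace multiplied by osd_pool_default_read_lease_ratio, with osd_heartbeat_grace at its default of 20 seconds:

```python
# Assumption (from the stale_read dev doc): while a new lease is being
# established, an op can wait in "waiting for readable" for up to roughly
# osd_heartbeat_grace * osd_pool_default_read_lease_ratio seconds.

OSD_HEARTBEAT_GRACE = 20.0  # seconds, Ceph default


def read_lease_seconds(ratio: float, grace: float = OSD_HEARTBEAT_GRACE) -> float:
    """Approximate read lease length for a given read_lease_ratio."""
    return ratio * grace


# Default ratio 0.8 gives about a 16 s lease, which lines up with the
# ~15 s "waiting for readable" stalls I see on OSD restarts.
print(f"ratio 0.8  -> {read_lease_seconds(0.8):.1f} s")

# The planned ratio of 0.15 gives about a 3 s lease, below our 5 s
# slow-op complaint threshold.
print(f"ratio 0.15 -> {read_lease_seconds(0.15):.1f} s")
```

Note that this does not explain the 32-second "waiting for readable" in your dump; that is far longer than any plausible lease length, which is why I suspect something else (e.g. an OSD flap or peering event) is going on in your cluster.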