On 1/9/20 10:33 AM, P. O. wrote:
Hi All,
I am observing strange behaviour on one OSD, which is dedicated to the
RGW index (pool: default.rgw.buckets.index).
My Ceph version is 14.2.4
(75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable).
The log files contain many messages like the ones below:
2020-01-08 10:37:53.054 7f1fa22dd700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency slow operation
observed for next, latency = 22.4641s
...
2020-01-08 14:57:06.987 7f1fa22dd700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency_fn slow operation
observed for upper_bound, latency = 28.9336s, after = <80>0_
omap_iterator(cid = 230.2c_head, oid =
#230:36f53ff9:::.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31:head#)
2020-01-08 15:17:39.972 7f1fa22dd700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency_fn slow operation
observed for upper_bound, latency = 23.9168s, after = <80>0_
omap_iterator(cid = 230.2c_head, oid =
#230:36f53ff9:::.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31:head#)
2020-01-08 15:37:44.221 7f1fa62e5700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency_fn slow operation
observed for upper_bound, latency = 25.8316s, after = <80>0_
omap_iterator(cid = 230.2c_head, oid =
#230:36f53ff9:::.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31:head#)
2020-01-08 15:58:12.279 7f1fa62e5700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency_fn slow operation
observed for upper_bound, latency = 27.8207s, after = <80>0_
omap_iterator(cid = 230.2c_head, oid =
#230:36f53ff9:::.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31:head#)
I discovered that these latencies coincide with the GC of the bilogs (it
runs every 30 minutes).
What led you to that conclusion? GC and bucket index logs are different
things. The bucket index log is only used for multisite replication and,
given your pool name 'default.rgw.buckets.index', it looks like you're
just running the single default zone. Is that correct?
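You can confirm that from the command line; as a rough sketch (the
realm/zonegroup/zone names in the output depend on your setup):
# radosgw-admin sync status
On a single default zone it should report the default zone and no other
zones to sync with.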
# /etc/ceph/ceph.conf
rgw sync log trim interval = 1200
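(To double-check the value actually in effect, something like this should
work on the node running the RGW daemon; a sketch, where client.rgw.<name>
is a placeholder for your daemon's admin socket name:
# ceph daemon client.rgw.<name> config get rgw_sync_log_trim_interval
)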
ID of bucket bucket_1:
12307ed8-2342-4293-9f14-12234220a234.13221775.224244
Shard no. 31 of the bucket_1 index:
.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31
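(For reference, all index shard objects of this bucket can be listed with
something like the following; just a sketch that greps for the bucket
instance ID above:
[root@mon-1 ~]# rados -p default.rgw.buckets.index ls | grep 12307ed8-2342-4293-9f14-12234220a234.13221775.224244
)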
[root@mon-1 ~]# ceph osd map default.rgw.buckets.index
.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31
osdmap e23417 pool 'default.rgw.buckets.index' (230) object
'.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31' -> pg
230.9ffcaf6c (230.2c) -> up ([62,56,60], p62) acting ([62,56,60], p62)
OSDs: 62, 56, 60
Problematic OSD: 62
Problematic PG: 230.2c
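(For a closer look at this PG and at what else lives on osd.62, something
like the following can be used; a sketch only:
[root@mon-1 ~]# ceph pg 230.2c query
[root@mon-1 ~]# ceph pg ls-by-osd 62
)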
Digging deeper...
When I run the command:
[root@mon-1 ~]# radosgw-admin bilog list --bucket bucket_1
[]
it returns an empty list. But even though there are no bilogs, the same
symptoms remain:
1) In the logs:
2020-01-08 16:31:24.728 7f1fa22dd700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency_fn slow operation
observed for upper_bound, latency = 28.653s, after = <80>0_
omap_iterator(cid = 230.2c_head, oid =
#230:36f53ff9:::.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31:head#)
2) Any other radosgw-admin command is very slow while another process is
iterating over the bilogs of this bucket.
[dc-2 root@mon-1 ~]# time radosgw-admin bucket stats --bucket=bucket234
...
real 0m25.641s
user 0m0.148s
sys 0m0.075s
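(To rule out oversized index shards, the per-shard object counts can be
checked with something like this; a sketch only:
[root@mon-1 ~]# radosgw-admin bucket limit check
It reports the per-shard object count and a fill status for each bucket.)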
When I try to run a deep scrub on PG 230.2c, the following appears in the logs:
[root@mon-1 ~]# ceph pg deep-scrub 230.2c
2020-01-08 16:38:32.551 7f1fc50f0700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f1fa22dd700' had timed out after 15
...
2020-01-08 16:38:46.735 7f1fc60f2700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f1fa62e5700' had timed out after 15
2020-01-08 16:38:46.735 7f1fc60f2700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f1fa22dd700' had timed out after 15
2020-01-08 16:38:46.735 7f1fc60f2700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f1fa62e5700' had timed out after 15
2020-01-08 16:38:46.911 7f1fa22dd700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency_fn slow operation
observed for upper_bound, latency = 29.8655s, after = <80>0_
omap_iterator(cid = 230.2c_head, oid =
#230:36f53ff9:::.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31:head#)
2020-01-08 16:38:46.911 7f1fa22dd700 1 heartbeat_map reset_timeout
'OSD::osd_op_tp thread 0x7f1fa22dd700' had timed out after 15
2020-01-08 16:38:46.911 7f1fa62e5700 1 heartbeat_map reset_timeout
'OSD::osd_op_tp thread 0x7f1fa62e5700' had timed out after 15
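(While the scrub hangs, the operations currently blocked on the OSD can be
dumped through its admin socket on the OSD host; a sketch:
# ceph daemon osd.62 dump_ops_in_flight
# ceph daemon osd.62 dump_historic_ops
)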
When I try to iterate over the omap keys (listomapkeys) of:
.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31
[root@mon-1 ~]# rados -p default.rgw.buckets.index listomapkeys
.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31
I observe the following in the logs:
2020-01-09 11:08:58.847 7f1fa62e5700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency slow operation
observed for next, latency = 24.1949s
The same happens with listomapvals:
[root@mon-1 ~]# rados -p default.rgw.buckets.index listomapvals
.dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31
2020-01-09 11:11:36.508 7f1fa62e5700 0
bluestore(/var/lib/ceph/osd/ceph-62) log_latency slow operation
observed for next, latency = 28.2137s
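(To get an idea of how big this shard's omap is and whether the cluster
already flags it, something like the following; a sketch:
[root@mon-1 ~]# rados -p default.rgw.buckets.index listomapkeys .dir.12307ed8-2342-4293-9f14-12234220a234.13221775.224244.31 | wc -l
[root@mon-1 ~]# ceph health detail | grep -i omap
)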
I tried purging and recreating this OSD, and I also trimmed the bilogs for
this bucket (radosgw-admin bilog trim --bucket bucket_1), but this didn't help.
I suspect that any operation touching PG 230.2c on OSD 62 can cause
slow operations on my cluster.
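(If compaction of the OSD's omap is relevant here, a sketch of triggering it
on osd.62's host, assuming the OSD admin socket 'compact' command is
available on this release:
# ceph daemon osd.62 compact
)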
I have no idea how to debug/resolve this problem. Any help/hint is
appreciated!
Thanks,
Piotr
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx