Hi
I have been observing high latencies and hanging mount points while draining an OSD ever since the Octopus release, and the problem is still present on the latest Pacific.
Cluster setup:
Ceph Pacific 16.2.7
CephFS with an EC data pool
EC profile setup:
crush-device-class=
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=10
m=2
plugin=jerasure
technique=reed_sol_van
w=8
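(For reference, a profile with these parameters would have been created with something along these lines; the profile name "cephfs_ec" is just a placeholder:)
ceph osd erasure-code-profile set cephfs_ec \
    plugin=jerasure technique=reed_sol_van w=8 k=10 m=2 \
    crush-failure-domain=host crush-root=default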
Description:
If we have a broken drive, we remove it from the Ceph cluster by draining it first, i.e. setting its CRUSH weight to 0:
ceph osd crush reweight osd.1 0
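For completeness: after that we just wait for backfill to finish. Something like the following (with osd.1 from the example above) can be used to watch the drain and to check when the OSD no longer holds any PGs; this is just the standard procedure, nothing specific to the problem:
ceph -s                          # watch backfill/recovery progress
ceph osd safe-to-destroy osd.1   # reports when no PGs depend on the OSD any more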
On Nautilus this normally did not affect clients. But since the upgrade to Octopus (and up to the current Pacific release) I can observe very high IO latencies on clients while an OSD is being drained (10 seconds and higher).
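The impact is also visible from the cluster side via the usual slow-op checks (osd.70 below is simply the OSD from the dumps further down):
ceph health detail                      # SLOW_OPS warnings show up here if ops stay blocked long enough
ceph daemon osd.70 dump_ops_in_flight   # run on the OSD's host; shows in-flight ops and their age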
While debugging I found out that the drained OSD is still listed as ACTING_PRIMARY, and that this happens only on EC pools and only since Octopus. To be sure, I tested the same procedure on Nautilus, where the behavior is correct and the drained OSD is no longer listed in the UP and ACTING sets of its PGs.
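This can be checked, for example, by listing the PGs that still map to the drained OSD (osd.70 as in the dumps below):
ceph pg ls-by-osd osd.70       # PGs that still map to osd.70
ceph pg ls-by-primary osd.70   # PGs that still report osd.70 as their primary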
Even setting the primary-affinity of the given OSD to 0 has no effect on the EC pool.
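(For reference, the affinity was set with the usual command:)
ceph osd primary-affinity osd.70 0   # no effect on the EC pool's acting primary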
Below are my debug outputs:
Buggy behavior on Octopus and Pacific:
Before draining osd.70:
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  DISK_LOG  STATE  STATE_STAMP  VERSION  REPORTED  UP  UP_PRIMARY  ACTING  ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP  LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP  SNAPTRIMQ_LEN
16.1fff  2269  0  0  0  0  8955297727  0  0  2449  2449  active+clean  2022-05-19T08:41:55.241734+0200  19403690'275685  19407588:19607199  [70,206,216,375,307,57]  70  [70,206,216,375,307,57]  70  19384365'275621  2022-05-19T08:41:55.241493+0200  19384365'275621  2022-05-19T08:41:55.241493+0200  0
dumped pgs
after setting osd.70 crush weight to 0 (osd.70 is still acting primary):
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  DISK_LOG  STATE  STATE_STAMP  VERSION  REPORTED  UP  UP_PRIMARY  ACTING  ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP  LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP  SNAPTRIMQ_LEN
16.1fff  2269  0  0  2269  0  8955297727  0  0  2449  2449  active+remapped+backfill_wait  2022-05-20T08:51:54.249071+0200  19403690'275685  19407668:19607289  [71,206,216,375,307,57]  71  [70,206,216,375,307,57]  70  19384365'275621  2022-05-19T08:41:55.241493+0200  19384365'275621  2022-05-19T08:41:55.241493+0200  0
dumped pgs
Correct behavior on Nautilus:
Before draining osd.10:
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  DISK_LOG  STATE  STATE_STAMP  VERSION  REPORTED  UP  UP_PRIMARY  ACTING  ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP  LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP  SNAPTRIMQ_LEN
2.4e  2  0  0  0  0  8388608  0  0  2  2  active+clean  2022-05-20 02:13:47.432104  61'2  75:40  [10,0,7]  10  [10,0,7]  10  0'0  2022-05-20 01:44:36.217286  0'0  2022-05-20 01:44:36.217286  0
after setting osd.10 crush weight to 0 (behavior is correct: osd.10 is no longer listed or used):
root@nautilus1:~# ceph pg dump pgs | head -2
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  DISK_LOG  STATE  STATE_STAMP  VERSION  REPORTED  UP  UP_PRIMARY  ACTING  ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP  LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP  SNAPTRIMQ_LEN
2.4e  14  0  0  0  0  58720256  0  0  18  18  active+clean  2022-05-20 02:18:59.414812  75'18  80:43  [22,0,7]  22  [22,0,7]  22  0'0  2022-05-20 01:44:36.217286  0'0  2022-05-20 01:44:36.217286  0
Now the question is: is this some intended feature?
Or is it a bug?
Thank you!