Which Ceph version are you running, and is mClock active?

Joachim

___________________________________
Clyso GmbH - Ceph Foundation Member

On 21.03.23 at 06:53, Gauvain Pocentek wrote:
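For anyone following along, the version and scheduler questions can be answered like this (a minimal sketch; exact availability of the `osd_op_queue`/`osd_mclock_profile` options depends on the release, mClock became the default scheduler in Quincy):

```shell
# Show installed/running Ceph versions across all daemons
ceph versions

# Which op queue scheduler the OSDs use (mclock_scheduler vs wpq)
ceph config get osd osd_op_queue

# If mClock is active, which profile is in effect
# (e.g. balanced, high_client_ops, high_recovery_ops)
ceph config get osd osd_mclock_profile
```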
Hello all,

We have an EC (4+2) pool for RGW data, with HDDs + SSDs for WAL/DB. This pool spans 9 servers, each with 12 disks of 16 TB. About 10 days ago we lost a server and removed its OSDs from the cluster. Ceph started to remap and backfill as expected, but the process has been getting slower and slower. Today the recovery rate is around 12 MiB/s and 10 objects/s. All the remaining unclean PGs are backfilling:

  data:
    volumes: 1/1 healthy
    pools:   14 pools, 14497 pgs
    objects: 192.38M objects, 380 TiB
    usage:   764 TiB used, 1.3 PiB / 2.1 PiB avail
    pgs:     771559/1065561630 objects degraded (0.072%)
             1215899/1065561630 objects misplaced (0.114%)
             14428 active+clean
             50    active+undersized+degraded+remapped+backfilling
             18    active+remapped+backfilling
             1     active+clean+scrubbing+deep

We've checked the health of the remaining servers, and everything looks fine (CPU/RAM/network/disks). Any hints on what could be happening?

Thank you,
Gauvain
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
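These are not the original poster's commands, but a sketch of throttles commonly inspected when backfill keeps slowing down. The option names assume a recent release; with mClock active, some of these are governed by the profile rather than the raw settings:

```shell
# Current backfill/recovery throttles as one OSD sees them (osd.0 as an example)
ceph config show osd.0 osd_max_backfills
ceph config show osd.0 osd_recovery_max_active

# With mClock, prefer switching profiles over tuning individual knobs
ceph config set osd osd_mclock_profile high_recovery_ops

# On releases where mClock ignores manual backfill/sleep settings,
# this override must be enabled before raising them by hand
ceph config set osd osd_mclock_override_recovery_settings true
ceph config set osd osd_max_backfills 3
```

Raising `osd_max_backfills` increases concurrent backfill reservations per OSD at the cost of client I/O, so it is usually bumped gradually while watching `ceph -s`.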