On 2020-10-06 13:05, Igor Fedotov wrote:
>
> On 10/6/2020 1:04 PM, Kristof Coucke wrote:
>> Another strange thing is going on:
>>
>> No client software is using the system any longer, so we would expect
>> that all IOs are related to the recovery (fixing of the degraded PG).
>> However, the disks that are reaching high IO are not a member of the
>> PGs that are being fixed.
>>
>> So, something is heavily using the disk, but I can't find the process
>> immediately. I've read something that there can be old client
>> processes that keep on connecting to an OSD for retrieving data for a
>> specific PG while that PG is no longer available on that disk.
>>
> I bet it's rather PG removal happening in background....

^^ This, and probably the accompanying RocksDB housekeeping that goes with it; removing the PGs by itself shouldn't be too big a deal. Especially with very small files (and a lot of them) you probably have a lot of OMAP / META data (ceph osd df will tell you).

If that's indeed the case, then there is a (way) quicker option to get out of this situation: offline compaction of the OSDs. This runs orders of magnitude faster than compacting while the OSDs are still online.

To check whether this hypothesis is true: are the OSD servers where the PGs were located previously (and not the new hosts) under CPU stress?

Offline compaction per host:

systemctl stop ceph-osd.target
for osd in `ls /var/lib/ceph/osd/`; do (ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/$osd compact &); done

Gr. Stefan
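A minimal way to test the PG-removal hypothesis before taking anything offline, assuming you can reach the OSD admin sockets on the busy hosts (osd.<id> below is a placeholder for one of the OSDs on a busy disk):

# OMAP and META columns show how much RocksDB metadata each OSD carries:
ceph osd df

# On the busy host, check whether that OSD is still deleting PGs in the
# background; a non-zero numpg_removing means PG removal is still in progress:
ceph daemon osd.<id> perf dump | grep numpg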
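And a slightly more defensive sketch of the per-host compaction loop above: it assumes the default /var/lib/ceph/osd/ceph-<id> directory layout, compacts the OSDs one at a time instead of in parallel, and sets noout so the cluster does not start rebalancing while the OSDs are down:

ceph osd set noout
# compaction needs exclusive access to each OSD's RocksDB, so stop all OSDs on this host first
systemctl stop ceph-osd.target
for osd in /var/lib/ceph/osd/ceph-*; do
    echo "compacting ${osd} ..."
    ceph-kvstore-tool bluestore-kv "${osd}" compact
done
systemctl start ceph-osd.target
ceph osd unset noout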