Re: OSDs growing beyond full ratio

On 8/30/22 22:31, Wyll Ingersoll wrote:
> One of our OSDs eventually reached 100% capacity (in spite of the full ratio being 95%).  Now it is down and we cannot restart the OSD process on it because there is not enough space on the device.
>
> Is there a way to find PGs on that disk that can be safely removed without destroying data so we can bring it back online?  This is a bluestore OSD.

You can export PGs and import them on another OSD. After successful import you can delete the PG on the source OSD. See https://docs.ceph.com/en/pacific/man/8/ceph-objectstore-tool/


export:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${source_id} --pgid ${pgid} --op export --file /path/to/export/drive/${pgid}.dump

import:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${dest_id} --op import --file /path/to/export/drive/${pgid}.dump

remove if successful:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${source_id} --pgid ${pgid} --op remove --force
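
The three steps above can be strung together roughly as follows. This is only a sketch, not a tested recovery procedure. The helper names (run, list_pgs, move_pg) and the DRY_RUN and OSD_ROOT variables are made up for illustration, and the default data path layout is assumed to match the commands above. ceph-objectstore-tool works on an OSD that is not running, so both the source and the destination OSD have to be stopped first. With DRY_RUN=1 (the default here) the commands are only printed, so they can be reviewed before anything is touched.

```shell
#!/usr/bin/env bash
# Sketch only: move one PG from a full, stopped OSD to another stopped OSD.
# DRY_RUN and OSD_ROOT are illustrative names, not Ceph settings.
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands, anything else = execute
OSD_ROOT="${OSD_ROOT:-/var/lib/ceph/osd}"   # assumed OSD data directory layout

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# list_pgs <osd_id>: show which PGs live on the (stopped) OSD.
list_pgs() {
    run ceph-objectstore-tool --data-path "$OSD_ROOT/ceph-$1" --op list-pgs
}

# move_pg <source_id> <dest_id> <pgid> <dump_dir>
# Export, import, and only then remove; stops at the first failing step.
move_pg() {
    local src_path="$OSD_ROOT/ceph-$1"
    local dst_path="$OSD_ROOT/ceph-$2"
    local pgid="$3"
    local dump="$4/$3.dump"

    run ceph-objectstore-tool --data-path "$src_path" --pgid "$pgid" \
        --op export --file "$dump" || return 1
    run ceph-objectstore-tool --data-path "$dst_path" \
        --op import --file "$dump" || return 1
    run ceph-objectstore-tool --data-path "$src_path" --pgid "$pgid" \
        --op remove --force
}
```

For example, "list_pgs 12" prints the list-pgs command for OSD 12, and "move_pg 12 7 2.1f /mnt/spare" prints the export, import and remove commands in that order (the IDs, PG and path are placeholders). Keeping the remove step behind the import means a failed import leaves the source PG untouched.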


> I don't understand how this overfilling issue is not already a bug that is getting attention. It seems very broken that an OSD can blow way past its full_ratio.

Can you capture logs and cluster status? See this export tool, ceph-collect, by 42on to get useful data: https://github.com/42on/ceph-collect
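
If running ceph-collect is not an option, a few standard status commands already show most of what matters here (per-OSD utilisation and the configured ratios). The snippet below is a minimal sketch and not a replacement for ceph-collect. The function name capture_status and the CEPH variable are made up for illustration, and the output file names are arbitrary.

```shell
#!/usr/bin/env bash
# Sketch only: save basic cluster status into a directory.
CEPH="${CEPH:-ceph}"   # illustrative override for the ceph binary

# capture_status <out_dir>
capture_status() {
    mkdir -p "$1" || return 1
    "$CEPH" status        > "$1/status.txt"        2>&1
    "$CEPH" health detail > "$1/health-detail.txt" 2>&1
    "$CEPH" osd df tree   > "$1/osd-df-tree.txt"   2>&1   # per-OSD utilisation
    "$CEPH" osd dump      > "$1/osd-dump.txt"      2>&1   # includes the full/backfillfull/nearfull ratios
}
```

Usage would be something like "capture_status /tmp/ceph-status" on a node with an admin keyring.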

Gr. Stefan