Hi Frédéric,

Can you try the command below?
$ rados -p mailbox listsnaps rbd_data.26f7c5d05af621.0000000000002adf
rbd_data.26f7c5d05af621.0000000000002adf:
cloneid  snaps  size     overlap
3        3      4194304  []
head     -      4194304  <---- Do you see this line?
$ rados -p mailbox listsnaps rbd_data.26f7c5d05af621.0000000000002adf
rbd_data.26f7c5d05af621.0000000000002adf:
cloneid  snaps        size     overlap
27534    27523,27534  4194304  [0~1904640]
27553    27553        4194304  [0~1904640]
27636    27599,27636  4194304  [368640~3825664]
27673    27673        4194304  [1126400~3067904]
27710    27710        4194304  [1646592~2547712]
27721    27721        4194304  [1634304~2560000]
27732    27732        4194304  [1695744~2498560]
27743    27743        4194304  [1695744~2498560]
27780    27780        4194304  []
And
$ rados -p cephfs_data_ec listsnaps 100198218f2.0000008a
100198218f2.0000008a:
cloneid  snaps  size     overlap
4762     4705   4194304  []
If you don't see the 'head' line, then you're probably facing the orphan clones (aka leaked snapshots) bug described in [1], which was fixed by [2].
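If you want to scan a whole pool for objects that have clones but no head, here's a minimal sketch (assumptions: the pool is 'mailbox' as above, object names contain no whitespace, and matching the word 'head' in the listsnaps output is a good enough heuristic):

$ for obj in $(rados -p mailbox ls); do rados -p mailbox listsnaps "$obj" | grep -qw head || echo "no head: $obj"; done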
To get rid of these orphan clones, you need to run the command below on the pool; it requeues the orphan objects so they get snap-trimmed. See [3] for details.
$ ceph osd pool force-remove-snap cephfs_data_ec
Note that :
1. There's a --dry-run option you can use (see the example below).
2. This command should only be run ONCE. Running it twice or more is useless and can lead to OSDs crashing (they restart fine, but still... they all crash at the same time).
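For example (a sketch; double-check the exact flag spelling and placement with 'ceph osd pool force-remove-snap --help' on your release):

$ ceph osd pool force-remove-snap cephfs_data_ec --dry-run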
This sounds a little bit scary, to be honest. Do you mean the OSDs can crash once and restart fine, or do they crash, restart, crash, …?

Let us know how it goes.

Cheers,
Frédéric

[1] https://tracker.ceph.com/issues/64646
[2] https://github.com/ceph/ceph/pull/55841
[3] https://github.com/ceph/ceph/pull/53545

----- On 31 Jan 25, at 9:54, Frédéric Nass frederic.nass@xxxxxxxxxxxxxxxx wrote:

Hi Felix,
This is weird. The most likely explanation is that this resulted from a bug that has since been fixed.
What you could try as a starter is a ceph-bluestore-tool fsck [1] on the primary OSD of rbd_data.26f7c5d05af621.0000000000002adf to see if it lists any inconsistencies:
$ cephadm shell --name osd.$i --fsid $(ceph fsid) ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-$i/ --deep yes
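If you need to find the primary OSD for that object first, 'ceph osd map' will tell you (pool name 'mailbox' assumed from earlier in this thread; the first OSD in the acting set is the primary):

$ ceph osd map mailbox rbd_data.26f7c5d05af621.0000000000002adf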
If it doesn't, then I suppose you'll have to get rid of the objects you can't 'rados rm' with the help of ceph-objectstore-tool's remove operation [2].
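A minimal sketch of such a removal on a cephadm deployment (assumptions: the PG id comes from 'ceph osd map' as above, the OSD has to be stopped while ceph-objectstore-tool runs, and the cluster can tolerate that OSD being down; you'd have to repeat this on every OSD holding a copy of the object):

$ ceph orch daemon stop osd.$i
$ cephadm shell --name osd.$i --fsid $(ceph fsid) ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-$i --pgid <pgid> rbd_data.26f7c5d05af621.0000000000002adf remove
$ ceph orch daemon start osd.$i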
Regards, Frédéric.
[1] https://docs.ceph.com/en/latest/man/8/ceph-bluestore-tool/ [2] https://docs.ceph.com/en/latest/man/8/ceph-objectstore-tool/#removing-an-object
----- On 31 Jan 25, at 8:29, Felix Stolte <f.stolte@xxxxxxxxxxxxx> wrote:
Hi Frederic,

thanks for your suggestions. I took a look at all the objects and discovered the following:
1. rbd_id.<image_name>, rbd_header.<image_id> and rbd_object_map.<image_id> exist only for the 3 images listed by 'rbd ls' (I created a test image yesterday).
2. There are about 30 different image_ids with rbd_data.<image_id> objects that do not have an id, header, or object_map object.
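For reference, the distinct image ids behind the rbd_data objects can be pulled out with something like this (pool name assumed; field 2 of the dotted object name is the image id):

$ rados -p mailbox ls | awk -F. '/^rbd_data\./ {print $2}' | sort -u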
After that I tried to get the stats of one of the orphaned objects:

$ rados -p mailbox stat rbd_data.26f7c5d05af621.0000000000002adf
error stat-ing mailbox/rbd_data.26f7c5d05af621.0000000000002adf: (2) No such file or directory

I double-checked that the object name is the one listed by 'rados ls'. What makes it worse is that I can neither stat, get, nor rm the objects, while they are still counted towards disk usage. We will remove the whole pool for sure, but I would really like to get to the cause of this to prevent it from happening again.
On 30.01.2025 at 10:54, Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote:
Hi Felix,
Every rbd_data object belongs to an image that should have:

- an rbd_id.<image_name> object containing the image id (which you can get with 'rbd info')
- an rbd_header.<image_id> object with omap attributes you can list with listomapvals
To identify the image names these rbd_data objects belong(ed) to, you could list all rbd_id objects in that pool and, for each one of them, print the image id and the image name with the command below:
$ for rbd_id in $(rados -p $poolname ls | grep rbd_id); do echo "$(echo $rbd_id | cut -d '.' -f2) : $(rados -p $poolname get $rbd_id - | strings)"; done
image2 : 2a733b30debc84
image1 : 28d4fc1dddd922
It might take some time, but you'd get a clearer view of what these rbd_data objects refer(red) to. Also, if you can decode the timestamps in the 'rados -p rbd listomapvals rbd_header.<id> | strings' output, you could tell when each image was created and last accessed.
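For instance, on releases recent enough to store a 'create_timestamp' omap key in rbd_header, a sketch of the decoding (assuming the value is a binary utime, i.e. a little-endian 32-bit seconds/nanoseconds pair):

$ rados -p rbd getomapval rbd_header.<id> create_timestamp /tmp/ts
$ python3 -c "import struct, datetime; s, ns = struct.unpack('<II', open('/tmp/ts', 'rb').read()[:8]); print(datetime.datetime.fromtimestamp(s))"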
Hope that helps.
Regards, Frédéric.
PS: If you're moving away from iSCSI and only have 2 remaining images in this pool, you may also wait until these images are no longer in use and then detach them and remove the whole pool.
----- Le 30 Jan 25, à 9:09, Felix Stolte <f.stolte@xxxxxxxxxxxxx> a écrit :
Hi Frederic,

there is no namespace. The pool in question has the application rbd, but is not the default pool named 'rbd'.
On 29.01.2025 at 11:24, Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote:
Hi Felix,
Any RADOS namespaces in that pool? You can check using either:
rbd namespace ls rbd
or
rados stat -p rbd rbd_namespace && rados -p rbd listomapvals rbd_namespace
The rbd_data objects might be linked to namespaced images that can only be listed with:

$ rbd ls --namespace <namespace>

I suggest checking this because the 'rbd' pool has historically been Ceph's default RBD pool, long before iSCSI began using it (in its hardcoded implementation).
It might be worth doing this before taking any action.
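To enumerate every namespace and the images it contains in one go, here's a sketch (assumes jq is available):

$ for ns in $(rbd namespace ls rbd --format json | jq -r '.[].name'); do echo "== $ns =="; rbd ls rbd --namespace "$ns"; done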
Regards, Frédéric.
----- On 29 Jan 25, at 8:53, Felix Stolte f.stolte@xxxxxxxxxxxxx wrote:
Hi Alexander,
the trash is empty, and rbd ls only lists two images, with the prefixes rbd_data.1af561611d24cf and rbd_data.ed93e6548ca56b.
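(In case it's useful: the prefix of each surviving image can be read from 'rbd info', e.g. with something like the following; pool name assumed.)

$ for img in $(rbd ls mailbox); do echo "$img: $(rbd info mailbox/$img | awk '/block_name_prefix/ {print $2}')"; done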
rados ls gives:
rbd_data.d1b81247165450.00000000000055d2
rbd_data.32de606b8b4567.0000000000012f2f
rbd_data.ed93e6548ca56b.00000000000eef03
rbd_data.26f7c5d05af621.0000000000002adf
….
On 28.01.2025 at 22:46, Alexander Patrakov <patrakov@xxxxxxxxx> wrote:
Hi Felix,
A dumb answer first: if you know the image names, have you tried "rbd rm $pool/$imagename"? Or, is there any reason like concerns about iSCSI control data integrity that prevents you from trying that?
Also, have you checked the rbd trash?
On Tue, Jan 28, 2025 at 5:43 PM Stolte, Felix <f.stolte@xxxxxxxxxxxxx> wrote:
Hi guys,
we have an rbd pool that we used for images exported via ceph-iscsi on a 17.2.7 cluster. The pool uses about 10 times the disk space I would expect it to, and after investigating we noticed a lot of rbd_data objects whose images are no longer present. I assume the original images were deleted using gwcli, but not all objects were removed properly.

What would be the best/most secure way to get rid of these orphaned objects and reclaim the disk space?

Best regards,
Felix
---------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Registered office: Juelich
Registered in the commercial register of the district court of Dueren, No. HR B 3498
Chairman of the supervisory board: MinDir Stefan Müller
Management board: Prof. Dr. Astrid Lambrecht (chair), Karsten Beneke (deputy chair), Prof. Dr. Ir. Pieter Jansens
---------------------------------------------------------------------------------------------
-- Alexander Patrakov
Kind regards,
Felix Stolte
IT-Services