Re: Orphaned rbd_data Objects

Hi Felix, 

Response inline. 

----- On Feb 5, 2025, at 8:40, Felix Stolte <f.stolte@xxxxxxxxxxxxx> wrote: 

> Hi Frédéric,

>> Can you try the below command?

>> $ rados -p mailbox listsnaps rbd_data.26f7c5d05af621.0000000000002adf
>> rbd_data.26f7c5d05af621.0000000000002adf:
>> cloneid  snaps  size     overlap
>> 3        3      4194304  []
>> head     -      4194304           <---- Do you see this line?

> $ rados -p mailbox listsnaps rbd_data.26f7c5d05af621.0000000000002adf
> rbd_data.26f7c5d05af621.0000000000002adf:
> cloneid  snaps        size     overlap
> 27534    27523,27534  4194304  [0~1904640]
> 27553    27553        4194304  [0~1904640]
> 27636    27599,27636  4194304  [368640~3825664]
> 27673    27673        4194304  [1126400~3067904]
> 27710    27710        4194304  [1646592~2547712]
> 27721    27721        4194304  [1634304~2560000]
> 27732    27732        4194304  [1695744~2498560]
> 27743    27743        4194304  [1695744~2498560]
> 27780    27780        4194304  []

> And

> $ rados -p cephfs_data_ec listsnaps 100198218f2.0000008a
> 100198218f2.0000008a:
> cloneid  snaps  size     overlap
> 4762     4705   4194304  []

>> If you don't see the 'head' line, then you're probably facing the orphan clones
>> AKA leaked snapshots bug described in [1], which was fixed by [2].
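
For what it's worth, here's a minimal sketch to scan a whole pool for such headless objects (plain bash and the same rados CLI as above; it runs listsnaps once per object, so it can be slow on large pools):

$ for obj in $(rados -p mailbox ls | sort -u) ; do
      # clone-only leftovers have no 'head' line in their listsnaps output
      rados -p mailbox listsnaps "$obj" | grep -q '^head' || echo "orphan: $obj"
  done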

> Looks like we are affected by https://tracker.ceph.com/issues/64646 on both
> pools. We are currently on 17.2.7, while the fix is in 17.2.8.

Yep. 

>> To get rid of these orphan clones, you need to run the below command on the
>> pool to requeue the orphan objects for snap trimming. See [3] for details.

>> $ ceph osd pool force-remove-snap cephfs_data_ec

>> Note that:

>> 1. There's a --dry-run option you can use.
>> 2. This command should only be run ONCE. Running it twice or more is useless and
>> can lead to OSDs crashing (they restart fine, but still... crashing at the same
>> time).

> This sounds a little bit scary to be honest. Do you mean OSDs can crash once and
> restart fine, or do they crash, restart, crash, …?

We've seen OSDs crash and restart just fine (no crash loop or anything like that) when running the 'force-remove-snap' command on a test pool ** that still had snapshots **. Corner case... 
So I would recommend removing all remaining snapshots on these 2 pools (if any) before you run the 'force-remove-snap' command. No worries, you should be fine. 
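
A quick way to check (a sketch): pool-level snapshots show up with 'rados lssnap', while self-managed RBD snapshots have to be listed per image. CephFS snapshots on cephfs_data_ec would instead appear as .snap directories on the filesystem.

$ rados -p mailbox lssnap
$ rados -p cephfs_data_ec lssnap
$ for img in $(rbd ls mailbox) ; do rbd snap ls "mailbox/$img" ; done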

When running the 'force-remove-snap' command, you should see all PGs trimming snaps and the number of CLONES (in the 'rados df' output) going back to 0. 
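
Something like the below is one way to follow progress (the exact 'rados df' column layout varies a bit between releases):

$ watch -n 30 "rados df | egrep 'POOL_NAME|mailbox|cephfs_data_ec'"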

Frédéric. 

>> Let us know how it goes.

>> Cheers,
>> Frédéric.

>> [1] https://tracker.ceph.com/issues/64646
>> [2] https://github.com/ceph/ceph/pull/55841
>> [3] https://github.com/ceph/ceph/pull/53545

>> ----- On Jan 31, 2025, at 9:54, Frédéric Nass frederic.nass@xxxxxxxxxxxxxxxx wrote:

>>> Hi Felix,

>>> This is weird. The most likely explanation is that this resulted from a bug that
>>> has since been fixed.

>>> What you could try as a starter is a ceph-bluestore-tool fsck [1] on the
>>> primary OSD of rbd_data.26f7c5d05af621.0000000000002adf to see if it lists any
>>> inconsistencies:

>>> $ cephadm shell --name osd.$i --fsid $(ceph fsid) ceph-bluestore-tool fsck
>>> --path /var/lib/ceph/osd/ceph-$i/ --deep yes

>>> If it doesn't, then I suppose you'll have to get rid of these objects you can't
>>> 'rados rm' with the help of ceph-objectstore-tool <remove> [2].
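
For reference, the <remove> flow from [2] looks roughly like the below (a sketch only, not something I've run here; 'ceph osd map' tells you which PG and OSDs hold the object, $i and <pgid> are placeholders, and the OSD has to be stopped before ceph-objectstore-tool can open its store):

$ ceph osd map mailbox rbd_data.26f7c5d05af621.0000000000002adf
$ ceph orch daemon stop osd.$i
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-$i --pgid <pgid> \
      rbd_data.26f7c5d05af621.0000000000002adf remove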

>>> Regards,
>>> Frédéric.

>>> [1] https://docs.ceph.com/en/latest/man/8/ceph-bluestore-tool/
>>> [2]
>>> https://docs.ceph.com/en/latest/man/8/ceph-objectstore-tool/#removing-an-object

>>> ----- On Jan 31, 2025, at 8:29, Felix Stolte <f.stolte@xxxxxxxxxxxxx> wrote:

>>>> Hi Frederic,
>>>> thanks for your suggestions. I took a look at all the objects and discovered the
>>>> following:

>>>> 1. rbd_id.<image_name>, rbd_header.<image_id> and rbd_object_map.<image_id>
>>>> exist only for the 3 images listed by 'rbd ls' (I created a test image
>>>> yesterday).

>>>> 2. There are about 30 different image_ids with rbd_data.<image_id> objects which
>>>> do not have an id, header or object_map object.

>>>> After that I tried to get the stats of one of the orphaned objects:

>>>> rados -p mailbox stat rbd_data.26f7c5d05af621.0000000000002adf
>>>> error stat-ing mailbox/rbd_data.26f7c5d05af621.0000000000002adf: (2) No such
>>>> file or directory

>>>> I double-checked that the object name is the one listed by 'rados ls'. What makes
>>>> it worse is that I can neither stat, get nor rm the objects while they are still
>>>> counted for disk usage. We will remove the whole pool for sure, but I would really
>>>> like to get to the cause of this to prevent it from happening again.

>>>>> On 30.01.2025 at 10:54, Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote:

>>>>> Hi Felix,

>>>>> Every rbd_data object belongs to an image that should have:

>>>>> - an rbd_id.<image_name> object containing the image id (which you can get with
>>>>> rbd info)
>>>>> - an rbd_header object with omap attributes you can list with listomapvals

>>>>> To identify the image names these rbd_data objects belong(ed) to, you could list
>>>>> all rbd_id objects in that pool and, for each one of them, print the image id
>>>>> and the image name with the below command:

>>>>> $ for rbd_id in $(rados -p $poolname ls | grep rbd_id) ; do echo "$(echo $rbd_id
>>>>> | cut -d '.' -f2) : $(rados -p $poolname get $rbd_id - | strings)" ; done
>>>>> image2 : 2a733b30debc84
>>>>> image1 : 28d4fc1dddd922

>>>>> It might take some time, but you'd get a clearer view of what these rbd_data
>>>>> objects refer(red) to. Also, if you can decode the timestamps in 'rados -p rbd
>>>>> listomapvals rbd_header.<id> | strings' output, you could find out when each
>>>>> image was created and last accessed.
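
If it helps, here's a sketch of decoding one of those timestamps. I'm assuming the omap key name 'create_timestamp' and a raw utime_t value (32-bit little-endian seconds followed by 32-bit nanoseconds); worth double-checking both assumptions on your release:

$ rados -p mailbox getomapval rbd_header.<id> create_timestamp /tmp/ts
$ python3 -c "import struct, datetime ; s, ns = struct.unpack('<II', open('/tmp/ts','rb').read(8)) ; print(datetime.datetime.fromtimestamp(s))"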

>>>>> Hope that helps.

>>>>> Regards,
>>>>> Frédéric.

>>>>> PS: If you're moving away from iSCSI and only have 2 remaining images in this
>>>>> pool, you may also wait until these images are no longer in use and then detach
>>>>> them and remove the whole pool.

>>>>> ----- On Jan 30, 2025, at 9:09, Felix Stolte <f.stolte@xxxxxxxxxxxxx> wrote:

>>>>>> Hi Frederic,
>>>>>> there is no namespace. The pool in question has the application rbd, but it is
>>>>>> not the default pool named 'rbd'.

>>>>>>> On 29.01.2025 at 11:24, Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote:

>>>>>>> Hi Felix,

>>>>>>> Any RADOS namespaces in that pool? You can check using either:

>>>>>>> rbd namespace ls rbd

>>>>>>> or

>>>>>>> rados stat -p rbd rbd_namespace && rados -p rbd listomapvals rbd_namespace

>>>>>>> The rbd_data objects might be linked to namespaced images that can only be
>>>>>>> listed using the command: rbd ls --namespace <namespace>
>>>>>>> I suggest checking this because the 'rbd' pool has historically been Ceph's
>>>>>>> default RBD pool, long before iSCSI began using it (in its hardcoded
>>>>>>> implementation).

>>>>>>> Might be worth checking this before taking any action.
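
To enumerate images across every namespace in one pass, something like this should do (a sketch; assumes your rbd version supports JSON output for 'namespace ls' and that jq is available):

$ for ns in $(rbd namespace ls rbd --format json | jq -r '.[].name') ; do
      echo "namespace: $ns" ; rbd ls rbd --namespace "$ns"
  done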

>>>>>>> Regards,
>>>>>>> Frédéric.

>>>>>>> ----- On Jan 29, 2025, at 8:53, Felix Stolte f.stolte@xxxxxxxxxxxxx wrote:

>>>>>>>> Hi Alexander,

>>>>>>>> trash is empty and rbd ls only lists two images, with the prefixes
>>>>>>>> rbd_data.1af561611d24cf and rbd_data.ed93e6548ca56b

>>>>>>>> rados ls gives:

>>>>>>>> rbd_data.d1b81247165450.00000000000055d2
>>>>>>>> rbd_data.32de606b8b4567.0000000000012f2f
>>>>>>>> rbd_data.ed93e6548ca56b.00000000000eef03
>>>>>>>> rbd_data.26f7c5d05af621.0000000000002adf
>>>>>>>> ….

>>>>>>>> On 28.01.2025 at 22:46, Alexander Patrakov <patrakov@xxxxxxxxx> wrote:

>>>>>>>> Hi Felix,

>>>>>>>> A dumb answer first: if you know the image names, have you tried "rbd
>>>>>>>> rm $pool/$imagename"? Or, is there any reason like concerns about
>>>>>>>> iSCSI control data integrity that prevents you from trying that?

>>>>>>>> Also, have you checked the rbd trash?

>>>>>>>> On Tue, Jan 28, 2025 at 5:43 PM Stolte, Felix <f.stolte@xxxxxxxxxxxxx> wrote:

>>>>>>>> Hi guys,

>>>>>>>> we have an rbd pool we used for images exported via ceph-iscsi on a 17.2.7
>>>>>>>> cluster. The pool uses 10 times the disk space I would expect it to, and after
>>>>>>>> investigating we noticed a lot of rbd_data objects whose images are no longer
>>>>>>>> present. I assume that the original images were deleted using the gwcli but
>>>>>>>> not all objects were removed properly.

>>>>>>>> What would be the best/most secure way to get rid of these orphaned objects and
>>>>>>>> reclaim the disk space?

>>>>>>>> Best regards
>>>>>>>> Felix



>>>>>>>> --
>>>>>>>> Alexander Patrakov

>>>>>>>> With kind regards
>>>>>>>> Felix Stolte

>>>>>>>> IT-Services



>>>>>> With kind regards
>>>>>> Felix Stolte

>>>>>> IT-Services


>>>> With kind regards
>>>> Felix Stolte

>>>> IT-Services



> With kind regards
> Felix Stolte

> IT-Services

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



