Re: Deleting an rbd image hangs

Eugen Block <eblock@xxxxxx> · Tue, 08 May 2018 07:29:30 +0000

Hi,

I have a similar issue and would also appreciate some advice on how to
get rid of the already deleted files.

Ceph is our OpenStack backend, and there was a nova clone without
parent information. Apparently, the base image had been deleted
without a warning or anything, although there were existing clones.
Anyway, I tried to delete the respective rbd_data and rbd_header files
as described in [1]. There were about 700 objects to be deleted, but
255 objects remained according to the 'rados -p pool ls' command. The
attempt to delete the rest (again) resulted (and still results) in "No
such file or directory". About half an hour later one more object
vanished (the rbd_header file), so there are now still 254 objects
left in the pool. At first I thought Ceph might clean up after itself
and just need some time, but this was weeks ago and the number of
objects has not changed since then.

I would really appreciate any help.

Regards,
Eugen

Quoting Jan Marquardt <jm@xxxxxxxxxxx>:

On 30.04.18 at 09:26, Jan Marquardt wrote:
On 27.04.18 at 20:48, David Turner wrote:
This old blog post [1] about removing super large RBDs is not relevant
if you're using object map on the RBDs; however, its method to manually
delete an RBD is still valid.  You can see if this works for you to
manually remove the problem RBD you're having.

I followed the instructions, but it seems that 'rados -p rbd ls | grep
'^rbd_data.221bf2eb141f2.' | xargs -n 200  rados -p rbd rm' gets stuck,
too. It has been running since Friday and has still not finished. The
rbd image is/was about 1 TB in size.

Until now the only output was:
error removing rbd>rbd_data.221bf2eb141f2.00000000000051d2: (2) No such
file or directory
error removing rbd>rbd_data.221bf2eb141f2.000000000000e3f2: (2) No such
file or directory
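
A minimal sketch of the same cleanup as a small shell helper, in case it
helps to see exactly which objects fail. It assumes a POSIX shell with
awk and the rados CLI on the PATH; filter_prefix and remove_by_prefix
are hypothetical helper names, not Ceph commands. It matches the prefix
literally (the dots in the grep pattern above match any character) and
removes one object per call, recording the names that could not be
removed.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers, not part of Ceph.

# Print the object names read from stdin that start with the literal
# prefix $1. awk's index() is a plain substring search, so dots in the
# prefix are not treated as regex wildcards.
filter_prefix() {
    awk -v p="$1" 'index($0, p) == 1'
}

# Remove every object in pool $1 whose name starts with prefix $2,
# one object per rados call. Names that could not be removed are
# appended to the file named by $3 for later inspection.
remove_by_prefix() {
    pool=$1
    prefix=$2
    failed=$3
    rados -p "$pool" ls | filter_prefix "$prefix" | while IFS= read -r obj; do
        # </dev/null keeps rados from consuming the object list on stdin.
        rados -p "$pool" rm "$obj" </dev/null || printf '%s\n' "$obj" >> "$failed"
    done
}
```

For the image above this could be called as 'remove_by_prefix rbd
rbd_data.221bf2eb141f2. failed.txt', where failed.txt is an arbitrary
file name chosen for this example.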

I am still trying to get rid of this. 'rados -p rbd ls' still shows a
lot of objects beginning with rbd_data.221bf2eb141f2, but if I try to
delete them with 'rados -p rbd rm <obj>' it says 'No such file or
directory'. This is not the behaviour I'd expect. Any ideas?

Besides this, rbd_data.221bf2eb141f2.0000000000016379 is still causing
the OSDs to crash, which leaves the cluster unusable for us at the
moment. Even if it's just a proof of concept, I'd like to get this fixed
without destroying the whole cluster.

[1] http://cephnotes.ksperis.com/blog/2014/07/04/remove-big-rbd-image

On Thu, Apr 26, 2018 at 9:25 AM Jan Marquardt <jm@xxxxxxxxxxx> wrote:

    Hi,

    I am currently trying to delete an rbd image which is seemingly causing
    our OSDs to crash, but it always gets stuck at 3%.

    root@ceph4:~# rbd rm noc_tobedeleted
    Removing image: 3% complete...

    Is there any way to force the deletion? Any other advice?

    Best Regards

    Jan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
