Ceph 18: Unable to delete image after incomplete migration "image being migrated"

Hi Folks,

I'm running Ceph 18 with OpenStack for my lab (and home services) in a 3-node cluster on Ubuntu 22.04. I'm quite new to these platforms and just learning. This is my build, for what it's worth: https://blog.rhysgoodwin.com/it/openstack-ceph-hyperconverged/

I got myself into some trouble; this is the sequence of events.

I don't recall when, but at some stage I must have attempted an image migration from one pool to another. The source pool/image is infra-pool/sophosbuild; I don't know what the target would have been. In any case, on my travels I found the infra-pool/sophosbuild image in the trash:
rhys@hcn03:/imagework# rbd trash ls --all infra-pool
65a87bb2472fe sophosbuild
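
(In case it's relevant: my understanding is that trash entries are stored as omap keys on the rbd_trash object in the pool, so the raw entry should also be visible at the rados level with something like:

rados -p infra-pool listomapkeys rbd_trash

which lists a key like id_65a87bb2472fe.)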

I tried to delete it but got the following:

rhys@hcn03:/imagework# rbd trash rm infra-pool/65a87bb2472fe
2023-10-06T04:23:13.775+0000 7f28bbfff640 -1 librbd::image::RefreshRequest: image being migrated
2023-10-06T04:23:13.775+0000 7f28bbfff640 -1 librbd::image::OpenRequest: failed to refresh image: (30) Read-only file system
2023-10-06T04:23:13.775+0000 7f28bbfff640 -1 librbd::ImageState: 0x7f28a804b600 failed to open image: (30) Read-only file system
2023-10-06T04:23:13.775+0000 7f28a2ffd640 -1 librbd::image::RemoveRequest: 0x7f28a8000b90 handle_open_image: error opening image: (30) Read-only file system
Removing image: 0% complete...failed.
rbd: remove error: (30) Read-only file system

Next, I tried to restore the image, and this also failed:
rhys@hcn03:/imagework# rbd trash restore infra-pool/65a87bb2472fe
librbd::api::Trash: restore: Current trash source 'migration' does not match expected: user,mirroring,unknown (4)

Probably stupidly, I followed the steps in this post: https://www.spinics.net/lists/ceph-users/msg72786.html, changing the byte at offset 07 of the trash entry's omap value from 02 (TRASH_IMAGE_SOURCE_MIGRATION) to 00 (TRASH_IMAGE_SOURCE_USER).
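
For the record, what I did was roughly the following. This is only a sketch of what that post describes; trash_entry.bin is just my working filename, and the exact offset presumably depends on how the trash entry is encoded:

# dump the trash entry's omap value to a file
rados -p infra-pool getomapval rbd_trash id_65a87bb2472fe trash_entry.bin
# flip the byte at offset 0x07 from 0x02 (migration) to 0x00 (user)
printf '\x00' | dd of=trash_entry.bin bs=1 seek=7 count=1 conv=notrunc
# write the edited value back
rados -p infra-pool setomapval rbd_trash id_65a87bb2472fe --input-file trash_entry.bin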

After this I was able to restore the image successfully.
However, I still could not delete it:
rhys@hcn03:/imagework# rbd rm infra-pool/sophosbuild
2023-10-06T05:52:30.708+0000 7ff5937fe640 -1 librbd::image::RefreshRequest: image being migrated
2023-10-06T05:52:30.708+0000 7ff5937fe640 -1 librbd::image::OpenRequest: failed to refresh image: (30) Read-only file system
2023-10-06T05:52:30.708+0000 7ff5937fe640 -1 librbd::ImageState: 0x564d3f83d680 failed to open image: (30) Read-only file system
Removing image: 0% complete...failed.
rbd: delete error: (30) Read-only file system

I tried to abort the migration:

root@hcn03:/imagework# rbd migration abort infra-pool/sophosbuild

This took a few minutes but failed at 99% (sorry, the terminal scrollback is lost).
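
In case it helps with diagnosis: my understanding is that an in-flight migration is recorded as an omap entry (key "migration") on the image's header object, named rbd_header.<image id>, so something like this should show what's still recorded:

rados -p infra-pool listomapvals rbd_header.65a87bb2472fe

I haven't dared to hand-edit that one.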

So now I'm stuck: I don't know how to get rid of this image, and while everything else in the cluster is healthy, the dashboard throws errors when it tries to enumerate the images in that pool.

I'm considering migrating the good images off this pool and then deleting the pool (a rough sketch of what I have in mind is below). But I don't even know if I'll be allowed to delete the pool while this issue is present.
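
Roughly, assuming a placeholder destination pool called backup-pool and a healthy image called goodimage:

# copy each healthy image to the other pool
rbd export infra-pool/goodimage - | rbd import - backup-pool/goodimage

# pool deletion is disabled by default and has to be enabled first
ceph config set mon mon_allow_pool_delete true
ceph osd pool delete infra-pool infra-pool --yes-i-really-really-mean-it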

Any advice would be much appreciated.

Kind regards,
Rhys


