Re: rbd live migration recovery

Right,

To answer my own question: after reading https://cephdocs.s3-website.cern.ch/ops/rbd_troubleshooting.html, I tried hex-editing the rbd_trash omap value of the image in question.

Looking at the source for TrashImageSpec::encode (https://github.com/ceph/ceph/blob/6fee777d603aebce492c57b41f3b5760d50ddb07/src/cls/rbd/cls_rbd_types.cc, line 1102), the "trash source" appeared to be at byte offset #7.
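
For reference, here is my reading of the encoded layout: ENCODE_START emits a one-byte struct version, a one-byte compat version and a 4-byte little-endian payload length before the actual fields, so the source byte lands right after that 6-byte header. Treat the offsets below as an assumption and double-check them against an xxd dump of your own value before editing anything:

offset 0      u8       struct_v
offset 1      u8       struct_compat
offset 2-5    u32 le   payload length
offset 6      u8       trash source (00 = USER, 01 = MIRRORING, 02 = MIGRATION)
offset 7-     string   image name (u32 le length + bytes), then the timestamps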

I changed the 02 (TRASH_IMAGE_SOURCE_MIGRATION) in the omap value to 00 (TRASH_IMAGE_SOURCE_USER), wrote the omap value back, and that got my image back.

So if anyone stumbles on this, here's a brief rundown of the commands used:

root@r530-2:~# rbd -p VMpool trash list --all
892bd7086688e vm-206087-disk-1
root@r530-2:~# rados -p VMpool getomapval rbd_trash id_892bd7086688e trash_key
root@r530-2:~# hexedit trash_key
(byte #7: was 02, changed it to 00, saved; a scripted alternative is sketched after this listing)
root@r530-2:~# rados -p VMpool setomapval rbd_trash id_892bd7086688e --input-file trash_key
root@r530-2:~# rbd -p VMpool trash list
892bd7086688e vm-206087-disk-1
(no --all necessary any more, since the entry is now treated as user-trashed)
root@r530-2:~# rbd -p VMpool trash restore 892bd7086688e
root@r530-2:~# rbd -p VMpool list -l
NAME                              SIZE   PARENT  FMT  PROT  LOCK
vm-206087-disk-1                 2 TiB            2
vm-206087-disk-1@T202206211649   2 TiB            2
vm-206087-disk-1@T202206221655   2 TiB            2
vm-206087-disk-1@T202206231620   2 TiB            2
vm-206087-disk-1@T202206241459   2 TiB            2
vm-206087-disk-1@T202206251554   2 TiB            2
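
In case hexedit isn't handy, the same byte flip can be scripted. This is only a sketch, assuming the 02 really sits at the offset worked out above; set OFF to whatever offset your own xxd dump shows:

rados -p VMpool getomapval rbd_trash id_892bd7086688e trash_key
xxd trash_key | head -n 2                              # locate the trash source byte (the 02)
OFF=6                                                  # 0-based offset of that byte in the dump
printf '\x00' | dd of=trash_key bs=1 seek=$OFF count=1 conv=notrunc
rados -p VMpool setomapval rbd_trash id_892bd7086688e --input-file trash_key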

Cheers,

-Kostas



On 09/07/2022 20.51, koukou73gr wrote:


Hello,

I was playing around with rbd migration and happened to interrupt the prepare step. That is, I hit Ctrl-C while rbd migration prepare was running.

Now I am left with a half-baked migration target (it has half of the source snapshots) and a migration source that sits in the trash.

Aborting the migration results in error:

root@pe2950-1:~# rbd migration abort VMpoolEC/vm-206087-disk-1
2022-07-09T20:19:47.474+0300 7f52b4cc1340 -1 librbd::Migration: open_images: failed retrieving migration header: (22) Invalid argument
Abort image migration: 0% complete...failed.

Restoring from trash results in error as well:

root@pe2950-1:~# rbd trash ls VMpool --all
892bd7086688e vm-206087-disk-1
root@pe2950-1:~# rbd trash restore VMpool/892bd7086688e
rbd: restore error: 2022-07-09T20:29:17.596+0300 7f301643e340 -1 librbd::api::Trash: restore: Current trash source 'migration' does not match expected: user,mirroring,unknown (4)
(22) Invalid argument

I can't seem to find a way out of this situation in the docs. Is there something I can do? The cluster is for testing and the data can be discarded, but it would be good to know whether interrupting a step during rbd migration is a huge no-no.

Thanks in advance,

-Kostas

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


