Re: A couple OSDs not starting after host reboot

Hi chenhui,

work to support multiple labels, which would avoid this issue, is still in progress (https://github.com/ceph/ceph/pull/55374). But this is of little help for your current case.

If your disk is fine (meaning it can read and write the block at offset 0), you might want to try to recover the label by copying the label from a different OSD that sits on a similar main device (that's important!). After copying, the osd_uuid, whoami and osd_key fields need to be updated. Here is the step-by-step procedure:

1. Copy the OSD label (the 4K data block at offset 0) from the source OSD's main device to the same location on the broken one:

> dd if=<source_osd_block_device> of=<target_osd_block_device> count=1 bs=4096
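
Since step 1 overwrites the first 4K of the target device, it may be worth saving that block to a file first, so the original bytes can be put back if needed. A minimal sketch, assuming GNU dd; the helper name backup_label and the output path are hypothetical and not part of any Ceph tooling:

```shell
# Hypothetical helper (not a Ceph command): save the first 4 KiB of a
# device, i.e. the label block, to a regular file.
backup_label() {
    dev="$1"   # device (or file) to read the label block from
    out="$2"   # file that receives the 4096-byte copy
    dd if="$dev" of="$out" bs=4096 count=1 status=none
}

# Example call, using the same placeholder as in step 1:
#   backup_label <target_osd_block_device> /root/osd-N-label.bak
```

The saved block can be written back with the same dd invocation as in step 1, using the backup file as the input.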

2. Find the broken OSD's uuid (N denotes the broken OSD's id):

> ceph report | grep '"osd": N' -A 1
                "osd": N,
                "uuid": "6a4ca4ab-6a43-473c-b09c-b13bdd9def5c",

3. Set the obtained uuid in the copied OSD label:

> ceph-bluestore-tool --dev <target_osd_block_device> --command set-label-key -k osd_uuid -v 6a4ca4ab-6a43-473c-b09c-b13bdd9def5c

4. Update the whoami field in the copied label:

> ceph-bluestore-tool --dev <target_osd_block_device> --command set-label-key -k whoami -v N

5. Learn the OSD's key (osd.1 is used as the example here; substitute your own OSD id):

> ceph auth ls | grep osd.1 -A 2
osd.1
        key: AQDrvg9maKxvKxAAqAzqCeR6y0UqBSVIyDhppg==

6. Update the osd_key field in the copied label:

> ceph-bluestore-tool --dev <target_osd_block_device> --command set-label-key -k osd_key -v AQDrvg9maKxvKxAAqAzqCeR6y0UqBSVIyDhppg==

7. Prime the OSD dir if it has been lost:

> ceph-bluestore-tool --dev <target_osd_block_device> --path <path-to-target-osd-folder> --command prime-osd-dir


At this point the OSD should be able to start, provided the corrupted label was the only problem.
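
Before starting the OSD, the rewritten label can be inspected with ceph-bluestore-tool show-label to confirm that the three updated fields hold the intended values. A rough sketch: label_has is a hypothetical helper, and it assumes show-label prints JSON in which each field appears as "key": "value" with the value quoted as a string; adjust the pattern if your version formats the output differently.

```shell
# Hypothetical helper: read show-label output on stdin and succeed only if
# it contains the given key with exactly the given (quoted) string value.
label_has() {
    key="$1"
    value="$2"
    grep -qF "\"$key\": \"$value\""
}

# Example calls, using the placeholders from the steps above:
#   ceph-bluestore-tool show-label --dev <target_osd_block_device> | label_has whoami N
#   ceph-bluestore-tool show-label --dev <target_osd_block_device> | label_has osd_uuid <uuid-from-step-2>
```

A non-zero exit status from label_has means the field is missing or still carries the source OSD's value.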

Hope this helps,

Igor.


On 05/04/2024 05:50, xu chenhui wrote:
Hi,
Has there been any progress on this issue? Is there a quick recovery method? I have the same problem as you: the first 4K block of the OSD metadata is invalid. Recreating the OSD would come at a heavy price.

Thanks.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx