Have you tried to hexdump the actual NVMe devices instead of the testdisk images?

`hexdump -C -n 4096 /dev/nvmexxx` should show the LVM LABELONE header with the PV UUID, and `hexdump -C -n 4096 /dev/ceph-454751de-44ab-4aa6-b3ae-50abc22250b3/osd-block-b7745d63-0bf8-4ba4-9274-e034f1c15d7b` should show the bluestore metadata (aka the 'label') that you would normally get with `ceph-bluestore-tool show-label`.

If nothing shows up, then you may need to rewrite the label as described by Igor here [1].

Thing is, you'd have to make sure you're dealing with the right disk, meaning the two NVMes were not swapped. To help with that, you could run `strings /dev/nvmexxx | less` and try to match the VG information with the OSD config files, if you're able to.

Hope that helps.

Regards,
Frédéric.

[1] https://www.spinics.net/lists/ceph-users/msg81813.html

----- On 3 Sep 24, at 11:56, Marco Faggian <m@xxxxxxxxxxxxxxxx> wrote:

> Hi Frédéric,
>
> Thanks a lot for the pointers!
>
> So, using testdisk I’ve created images of both the LVs. I’ve looked at the
> hexdump and it’s filled with 0x00 until 00a00000.
> Then, out of curiosity, I compared the two images and they’re identical until
> byte 12726273.
>
> Also, unfortunately, ceph-bluestore-tool show-label and dumpe2fs -h /dev/ceph-3..
> are both erroring out in the same way:
>
> unable to read label for
> /dev/ceph-454751de-44ab-4aa6-b3ae-50abc22250b3/osd-block-b7745d63-0bf8-4ba4-9274-e034f1c15d7b:
> 2024-09-03T11:27:22.938+0200 7f2d89f19a00 -1
> bluestore(/dev/ceph-454751de-44ab-4aa6-b3ae-50abc22250b3/osd-block-b7745d63-0bf8-4ba4-9274-e034f1c15d7b)
> _read_bdev_label unable to decode label at offset 102: void
> bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&)
> decode past end of struct encoding: Malformed input [buffer:3]
> (2) No such file or directory
>
> Unfortunately the thread on the tracker doesn’t seem to point to a solution,
> even though the issue looks identical.
>
> The hexdump explains why it’s not finding the label. Might it be that the LV is
> not correctly mapped?
>
> Basically, the question here is: is there a way to recover the data of an OSD in
> an LV, if it was ceph osd purge'd before the cluster had a chance to replicate it
> (after ceph osd out)?
>
> Thanks for your time!
>
> fm
>
>> On 3 Sep 2024, at 10:35, Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote:
>>
>> Hi Marco,
>>
>> Have you checked the output of:
>>
>> dd if=/dev/ceph-xxxxxxx/osd-block-xxxxxxxxx of=/tmp/foo bs=4K count=2
>> hexdump -C /tmp/foo
>>
>> and:
>>
>> /usr/bin/ceph-bluestore-tool show-label --log-level=30 --dev /dev/nvmexxx -l
>> /var/log/ceph/ceph-volume.log
>>
>> to see if it's aligned with the OSD's metadata?
>>
>> You may also want to check this discussion [1] and this tracker [2] for useful
>> commands.
>>
>> Regards,
>> Frédéric
>>
>> [1] https://marc.info/?l=ceph-users&m=171395775626007&w=2
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1755956
>>
>> ----- On 2 Sep 24, at 11:26, m@xxxxxxxxxxxxxxxx wrote:
>>
>>>> FYI: Also posted in the L1Techs forum:
>>>> https://forum.level1techs.com/t/recover-bluestore-osd-in-ceph-cluster/215715.
>>>
>>> ## The epic intro
>>>
>>> Through self-inflicted pain, I’m writing here to ask for volunteers on the
>>> journey of recovering the lost partitions housing the CephFS metadata pool.
>>>
>>> ## The setup
>>>
>>> 1 Proxmox host (I know)
>>> 1 replication rule only for NVMes (2x OSD)
>>> 1 replication rule only for HDDs (8x OSD)
>>> Each with failure domain set to osd.
>>> Each OSD configured to use the bluestore backend on an LVM LV.
>>> No backup (I know, I know).
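For readers wanting to picture the setup above: device-class-restricted replicated rules of that kind are usually created and assigned roughly as follows. This is only a sketch; the rule and pool names are hypothetical, only the device classes and the osd failure domain come from the thread:

  # one replicated rule per device class, failure domain = osd, root = default
  ceph osd crush rule create-replicated nvme-only default osd nvme
  ceph osd crush rule create-replicated hdd-only default osd hdd
  # e.g. pin the CephFS metadata pool to the NVMe-only rule
  ceph osd pool set cephfs_metadata crush_rule nvme-only

With only two NVMe OSDs and a failure domain of osd, both replicas of the metadata pool lived on those two NVMes, which is why removing both OSDs (next section) left no surviving copy.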
>>> ## The cause (me)
>>>
>>> Long story short: I needed the PCIe lanes and decided to remove the two NVMEs
>>> that were hosting the metadata pool for CephFS and the .mgr pool. I proceeded to
>>> remove the two OSDs (out and destroy).
>>> This is where I goofed: I didn’t change the replication rule to the HDDs’ one,
>>> so the cluster never moved the PGs stored on the NVMes to the HDDs.
>>>
>>> ## What I’ve done until now
>>>
>>> 1. Re-seated the NVMes in their original place.
>>> 2. Found out that the LVs didn’t have the OSDs’ labels applied.
>>> 3. Forced the backed-up LVM config onto the two NVMes (thanks to the holy entity
>>> that thought archiving the LVM config was a good thing, it paid off).
>>> 4. Tried ceph-volume lvm activate 8 <id>, only to find out that it’s unable to
>>> decode the label at offset 102 in the LV for that OSD.
>>>
>>> ## Wishes
>>>
>>> 1. Does anyone know a way to recover what I feel is a lost partition, given that
>>> the “file system” is Ceph’s bluestore?
>>> 2. Is there a way to know, if that’s the case, how the partition was nuked? And
>>> possibly a way to reverse that process?
>>>
>>> ## Closing statement
>>>
>>> Eternal reminder: if you don’t want to lose it, back it up.
>>>
>>> Thanks for your time, to the kind souls who are willing to die on this hill
>>> with me, or come up victorious!

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
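As a closing note on step 3 of the original post ("forced the backed-up LVM config"): restoring LVM metadata from the archive is typically done along these lines. This is only a sketch; the VG name is the one seen earlier in the thread, while the PV UUID, archive file name and NVMe device are placeholders that must be taken from /etc/lvm/archive/ and checked before running anything:

  # list the archived metadata versions kept for the VG
  vgcfgrestore --list ceph-454751de-44ab-4aa6-b3ae-50abc22250b3
  # recreate the PV header with its old UUID (data blocks are not touched)
  pvcreate --uuid <PV-UUID-from-archive> \
           --restorefile /etc/lvm/archive/<archive-file>.vg /dev/nvmexxx
  # restore the VG metadata and reactivate the LVs
  vgcfgrestore -f /etc/lvm/archive/<archive-file>.vg ceph-454751de-44ab-4aa6-b3ae-50abc22250b3
  vgchange -ay ceph-454751de-44ab-4aa6-b3ae-50abc22250b3

Note that this only brings back the LVM mapping; the bluestore label at the start of the LV is a separate piece of metadata, which is why ceph-bluestore-tool show-label can still fail afterwards, as seen above.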