Re: The journey to CephFS metadata pool’s recovery

Hi Marco,

Have you checked the output of:

dd if=/dev/ceph-xxxxxxx/osd-block-xxxxxxxxx of=/tmp/foo bs=4K count=2
hexdump -C /tmp/foo
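
If the first 4K block still carries a valid label, the hexdump should start with the BlueStore magic string, roughly like this (UUID and trailing bytes replaced with placeholders):

00000000  62 6c 75 65 73 74 6f 72  65 20 62 6c 6f 63 6b 20  |bluestore block |
00000010  64 65 76 69 63 65 0a xx  xx xx xx xx xx xx xx xx  |device.<osd uuid|

If that "bluestore block device" string is missing, or the block is all zeros, the label has been overwritten.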

and:

/usr/bin/ceph-bluestore-tool show-label --log-level=30 --dev /dev/nvmexxx -l /var/log/ceph/ceph-volume.log

to see whether the label matches the OSD's metadata.
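
If the label is intact, show-label should print a JSON blob along these lines (the values below are placeholders, and the exact set of keys varies by release):

{
    "/dev/nvmexxx": {
        "osd_uuid": "<osd fsid>",
        "size": 1000204886016,
        "btime": "...",
        "description": "main"
    }
}

The osd_uuid can then be compared with the ceph.osd_fsid tag on the LV (lvs -o +lv_tags) to confirm the label really belongs to that OSD.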

You may also want to check this discussion [1] and this tracker [2] for useful commands.

Regards,
Frédéric

[1] https://marc.info/?l=ceph-users&m=171395775626007&w=2
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1755956

----- On 2 Sep 24, at 11:26, m@xxxxxxxxxxxxxxxx wrote:

>> FYI: Also posted in L1Techs forum:
>> https://forum.level1techs.com/t/recover-bluestore-osd-in-ceph-cluster/215715.
> 
> ## The epic intro
> Through self-inflicted pain, I’m writing here to ask for volunteers for the
> journey of recovering the lost partitions housing the CephFS metadata pool.
> 
> ## The setup
> 1 Proxmox host (I know)
> 1 replication rule for NVMes only (2x OSD)
> 1 replication rule for HDDs only (8x OSD)
> Each with the failure domain set to osd.
> Each OSD configured to use the BlueStore backend on an LVM volume.
> No backup (I know, I know).
> 
> ## The cause (me)
> Long story short: I needed the PCIe lanes and decided to remove the two NVMes
> that were hosting the CephFS metadata pool and the .mgr pool. I proceeded to
> remove the two OSDs (out and destroy).
> This is where I goofed: I didn’t switch the replication rule to the HDD one first,
> so the cluster never moved the PGs stored on the NVMes to the HDDs.
> 
> ## What I’ve done until now
>  1. Re-seated the NVMes in their original slots.
>  2. Found out that the LVM volumes didn’t have the OSD labels applied.
>  3. Forced the backed-up LVM config back onto the two NVMes (thanks to the holy
>  entity that decided archiving LVM config was a good thing, it paid off); see the
>  sketch below.
>  4. Tried ceph-volume lvm activate 8 <id>, only to find that it’s unable to decode
>  the label at offset 102 in the LVM for that OSD.
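> 
> (The restore in step 3 was done from the archived config under /etc/lvm/archive,
> roughly along these lines — the archive file and VG names are placeholders:)
> 
> vgcfgrestore -f /etc/lvm/archive/ceph-xxxxxxx_0000x.vg ceph-xxxxxxx
> vgchange -ay ceph-xxxxxxx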
> 
> ## Wishes
>  1. Does anyone know a way to recover what I suspect is a lost partition, given
>  that the “file system” is Ceph’s BlueStore?
>  2. Is there a way to find out how the partition was nuked, and possibly a way to
>  reverse that process?
> 
> ## Closing statement
> Eternal reminder: If you don’t want to lose it, back it up.
> Thanks for your time, to the kind souls who are willing to die on this hill
> with me, or come out victorious!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx