On Thu, Jan 31, 2019 at 10:51 AM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> On Thu, 31 Jan 2019, Yury Z wrote:
> > Hi,
> >
> > We've experimented with running OSDs in docker containers. And got into
> > the situation where two OSDs were started with the same block device.
> > File locks inside the mounted osd dirs didn't catch that issue because
> > the mounted osd dirs were inside containers. So we got a corrupted
> > osd_superblock on the osd bluestore drive, and now the OSD can't be
> > started.
>
> AHA! Someone else ran into this and it was a mystery to me how this
> happened. How did you identify locks as the culprit? And can you
> describe the situation that led to two competing containers running
> ceph-osd?
>
> > # /usr/bin/ceph-osd -d --cluster ceph --id 74
> > 2019-01-31 15:12:31.889211 7f6ae7fdee40 -1
> > bluestore(/var/lib/ceph/osd/ceph-74) _verify_csum bad crc32c/0x1000
> > checksum at blob offset 0x0, got 0xd4daeff6, expected 0xda9c1ef0,
> > device location [0x4000~1000], logical extent 0x0~1000, object
> > #-1:7b3f43c4:::osd_superblock:0#
> > 2019-01-31 15:12:31.889227 7f6ae7fdee40 -1 osd.74 0 OSD::init() :
> > unable to read osd superblock
> > 2019-01-31 15:12:32.508923 7f6ae7fdee40 -1 ** ERROR: osd init failed:
> > (22) Invalid argument
> >
> > We've tried to fix it with the ceph bluestore tool, but it didn't help.
> >
> > # ceph-bluestore-tool repair --deep 1 --path /var/lib/ceph/osd/ceph-74
> > repair success
> >
> > Is it possible to fix the corrupted osd superblock?
>
> Maybe.. it's hard to tell. I think the next step is to add a bluestore
> option to warn on crc errors but to ignore them. With that option set,
> we can run a fsck on the OSD to see how much damage there really is, and
> potentially export critical PGs that you need to recover.
>
> What version of Ceph are you running?

Would be interesting to know if the OSD was deployed using ceph-disk or
ceph-volume... We do have a bunch of checks in ceph-volume that would
prevent such a situation; I don't think ceph-disk is as thorough.

> sage
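
For what it's worth, here is a minimal sketch (Python; /dev/sdX and the
lock-file path are hypothetical, and this is not what ceph-osd itself does)
of why a lock file inside each container's own mount of the osd dir cannot
exclude the other container, while an exclusive open of the block device
itself is arbitrated by the kernel and does fail for the second claimant:

    import errno
    import fcntl
    import os

    DEV = "/dev/sdX"                          # hypothetical block device
    LOCK = "/var/lib/ceph/osd/ceph-74/fsid"   # file seen only inside this container

    def try_flock(path):
        # flock(2) locks an inode.  If each container mounts its own osd dir,
        # the two lock files are different inodes, so both containers take
        # "exclusive" locks successfully and nothing is actually excluded.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd

    def claim_device(dev):
        # On Linux, opening a block device node with O_EXCL requests an
        # exclusive claim at the block layer: the open fails with EBUSY if
        # the device is mounted or already exclusively claimed, regardless
        # of which container or device node the other holder came through.
        try:
            return os.open(dev, os.O_RDWR | os.O_EXCL)
        except OSError as e:
            if e.errno == errno.EBUSY:
                raise SystemExit("%s is already in use" % dev)
            raise

    if __name__ == "__main__":
        claim_device(DEV)
        print("got exclusive claim on %s" % DEV)

Note the exclusive claim only conflicts with mounts and other O_EXCL
openers, so it is a guard against a second exclusive user rather than a
general lock, but that is exactly the two-ceph-osd case described above.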