On Thu, 31 Jan 2019 23:27:21 +0000 (UTC) Sage Weil <sage@xxxxxxxxxxxx> wrote:

> On Thu, 31 Jan 2019, Sage Weil wrote:
> > On Thu, 31 Jan 2019, Yury Z wrote:
> > > Hi,
> > >
> > > We've experimented with running OSDs in docker containers. And
> > > got the situation when two OSDs started with the same block
> > > device. File locks inside the mounted osd dir didn't catch that
> > > issue because the mounted osd dirs were inside containers. So, we
> > > got a corrupted osd_superblock on the osd bluestore drive. And now
> > > the OSD can't be started.
> >
> > AHA! Someone else ran into this and it was a mystery to me how
> > this happened. How did you identify locks as the culprit? And can
> > you describe the situation that led to two competing containers
> > running ceph-osd?
>
> I looked into this a bit and I'm not sure competing docker containers
> explains the issue. The bluestore code takes an fcntl lock on the
> block device when it opens it before doing anything at all, and I
> *think* those should work just fine across the container boundaries.

As far as I can see, the bluestore code takes an fcntl lock on the "fsid"
file inside the osd dir, not on the block device (see the
BlueStore::_lock_fsid method). In our case, we have the same block device
but a different osd dir for each ceph-osd docker container, so the two
OSDs can't detect each other or prevent simultaneous read/write
operations on the same block device.
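
To make the failure mode concrete, here is a minimal sketch of that
locking pattern (this is not the actual BlueStore source; the lock_fsid
helper and the /tmp/osd-* paths are just for illustration). Each
ceph-osd locks the "fsid" file in its own osd dir, so two containers
with different osd dirs both acquire their lock and nothing stops them
from opening the same block device:

// Minimal sketch of the BlueStore::_lock_fsid-style advisory lock:
// an exclusive fcntl() lock on <osd_dir>/fsid.  Not the real Ceph code;
// directory names are made up for the example.
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <cstring>
#include <cstdio>
#include <string>

// Try to take a whole-file write lock on <osd_dir>/fsid.
// Returns the open (and locked) fd, or -1 if another process already
// holds a lock on that fsid inode.
static int lock_fsid(const std::string& osd_dir)
{
  std::string path = osd_dir + "/fsid";
  int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0644);
  if (fd < 0)
    return -1;

  struct flock l;
  std::memset(&l, 0, sizeof(l));
  l.l_type = F_WRLCK;   // exclusive lock
  l.l_whence = SEEK_SET;
  l.l_start = 0;
  l.l_len = 0;          // 0 = lock the whole file

  if (::fcntl(fd, F_SETLK, &l) < 0) {
    ::close(fd);        // somebody else holds the lock on this inode
    return -1;
  }
  return fd;            // lock is released when the fd is closed
}

int main()
{
  ::mkdir("/tmp/osd-a", 0755);
  ::mkdir("/tmp/osd-b", 0755);

  // "Container A": locks the fsid file in its own osd dir.
  int fd_a = lock_fsid("/tmp/osd-a");
  std::printf("A locks /tmp/osd-a/fsid: %s\n", fd_a >= 0 ? "ok" : "fail");

  if (fork() == 0) {
    // "Container B": a different osd dir, hence a different fsid inode.
    // This lock succeeds too, even though in the real scenario both
    // OSDs then open the very same block device.
    int fd_b = lock_fsid("/tmp/osd-b");
    std::printf("B locks /tmp/osd-b/fsid: %s\n", fd_b >= 0 ? "ok" : "fail");

    // Only when both sides lock the *same* fsid file is the second
    // attempt refused.
    int fd_a2 = lock_fsid("/tmp/osd-a");
    std::printf("B locks /tmp/osd-a/fsid: %s\n", fd_a2 >= 0 ? "ok" : "fail");
    _exit(0);
  }
  ::wait(nullptr);
  return 0;
}

The lock only conflicts when both processes open the same fsid inode,
which never happens when each container bind-mounts its own osd dir.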