On Fri, 1 Feb 2019, Yury Z wrote:
> On Thu, 31 Jan 2019 23:27:21 +0000 (UTC)
> Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> > On Thu, 31 Jan 2019, Sage Weil wrote:
> > > On Thu, 31 Jan 2019, Yury Z wrote:
> > > > Hi,
> > > >
> > > > We've experimented with running OSDs in docker containers, and
> > > > got into a situation where two OSDs were started with the same
> > > > block device. The file locks inside the mounted osd dirs didn't
> > > > catch the issue because the mounted osd dirs were inside the
> > > > containers. So we got a corrupted osd_superblock on the OSD's
> > > > bluestore drive, and now the OSD can't be started.
> > >
> > > AHA!  Someone else ran into this and it was a mystery to me how
> > > this happened.  How did you identify locks as the culprit?  And
> > > can you describe the situation that led to two competing
> > > containers running ceph-osd?
> >
> > I looked into this a bit and I'm not sure competing docker
> > containers explains the issue.  The bluestore code takes an fcntl
> > lock on the block device when it opens it, before doing anything at
> > all, and I *think* those should work just fine across the container
> > boundaries.
>
> As far as I can see, the bluestore code takes an fcntl lock on the
> "fsid" file inside the osd dir, not on the block device (the
> BlueStore::_lock_fsid method). In our case we have the same block
> device but different osd dirs for each ceph-osd docker container, so
> they can't detect each other and prevent simultaneous rw operations
> on the same block device.

KernelDevice.cc *also* takes a lock on the block device itself, which
should be the same inode across any containers.  I'm trying to figure
out why that lock isn't working, though... :/

sage
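
P.S. For anyone who wants to poke at this outside of ceph-osd, below is a
minimal sketch of the kind of advisory fcntl write lock being discussed.
It is not the actual Ceph code, and the device path is only an example.
Run one copy in each container against the same device node; if the second
copy is refused, the lock itself works across the container boundary and
the problem is somewhere else.

    // fcntl_lock_check.cc -- illustrative sketch, not KernelDevice.cc.
    // Takes an exclusive advisory fcntl lock on a block device node and
    // holds it so a second instance can test for a conflict.
    #include <cerrno>
    #include <cstdio>
    #include <cstring>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char** argv)
    {
      // Example path only; pass the real device as argv[1].
      const char* path = argc > 1 ? argv[1] : "/dev/sdb";

      int fd = ::open(path, O_RDWR);
      if (fd < 0) {
        fprintf(stderr, "open %s failed: %s\n", path, strerror(errno));
        return 1;
      }

      // Advisory write lock over the whole device.  fcntl locks are kept
      // by the kernel on the inode, so two processes locking the same
      // device node should conflict even from different containers.
      struct flock l;
      memset(&l, 0, sizeof(l));
      l.l_type = F_WRLCK;
      l.l_whence = SEEK_SET;
      l.l_start = 0;
      l.l_len = 0;  // 0 means "to end of file", i.e. the whole device

      if (::fcntl(fd, F_SETLK, &l) < 0) {
        fprintf(stderr, "lock on %s refused: %s (another holder?)\n",
                path, strerror(errno));
        return 2;
      }

      printf("got exclusive lock on %s; holding until killed\n", path);
      pause();  // keep the lock alive for the second instance to test
      return 0;
    }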