On Thu, 31 Jan 2019 23:27:21 +0000 (UTC) Sage Weil <sage@xxxxxxxxxxxx> wrote:

> On Thu, 31 Jan 2019, Sage Weil wrote:
> > On Thu, 31 Jan 2019, Yury Z wrote:
> > > Hi,
> > >
> > > We've experimented with running OSDs in docker containers. And
> > > got the situation when two OSDs started with the same block
> > > device. File locks inside the mounted osd dir didn't catch that
> > > issue because the mounted osd dirs were inside containers. So, we
> > > got a corrupted osd_superblock on the osd bluestore drive. And now
> > > the OSD can't be started.
> >
> > AHA! Someone else ran into this and it was a mystery to me how
> > this happened. How did you identify locks as the culprit? And can
> > you describe the situation that led to two competing containers
> > running ceph-osd?
>
> I looked into this a bit and I'm not sure competing docker containers
> explains the issue. The bluestore code takes an fcntl lock on the
> block device when it opens it before doing anything at all, and I
> *think* those should work just fine across the container boundaries.

As far as I can see, the bluestore code takes an fcntl lock on the "fsid"
file inside the osd dir, not on the block device (see the
BlueStore::_lock_fsid method). In our case, we have the same block device
but a different osd dir for each ceph-osd docker container, so the two
OSDs can't detect each other or prevent simultaneous read/write
operations on the same block device.
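
To make the failure mode concrete, here is a minimal sketch of that
locking pattern (this is not the actual BlueStore source; the lock_fsid
helper and the /tmp/osd-* paths are just for illustration). Each
ceph-osd locks the "fsid" file in its own osd dir, so two containers
with different osd dirs both acquire their lock and nothing stops them
from opening the same block device:

// Minimal sketch of the BlueStore::_lock_fsid-style advisory lock:
// an exclusive fcntl() lock on <osd_dir>/fsid.  Not the real Ceph code;
// directory names are made up for the example.
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <cstring>
#include <cstdio>
#include <string>

// Try to take a whole-file write lock on <osd_dir>/fsid.
// Returns the open (and locked) fd, or -1 if another process already
// holds a lock on that fsid inode.
static int lock_fsid(const std::string& osd_dir)
{
  std::string path = osd_dir + "/fsid";
  int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0644);
  if (fd < 0)
    return -1;

  struct flock l;
  std::memset(&l, 0, sizeof(l));
  l.l_type = F_WRLCK;   // exclusive lock
  l.l_whence = SEEK_SET;
  l.l_start = 0;
  l.l_len = 0;          // 0 = lock the whole file

  if (::fcntl(fd, F_SETLK, &l) < 0) {
    ::close(fd);        // somebody else holds the lock on this inode
    return -1;
  }
  return fd;            // lock is released when the fd is closed
}

int main()
{
  ::mkdir("/tmp/osd-a", 0755);
  ::mkdir("/tmp/osd-b", 0755);

  // "Container A": locks the fsid file in its own osd dir.
  int fd_a = lock_fsid("/tmp/osd-a");
  std::printf("A locks /tmp/osd-a/fsid: %s\n", fd_a >= 0 ? "ok" : "fail");

  if (fork() == 0) {
    // "Container B": a different osd dir, hence a different fsid inode.
    // This lock succeeds too, even though in the real scenario both
    // OSDs then open the very same block device.
    int fd_b = lock_fsid("/tmp/osd-b");
    std::printf("B locks /tmp/osd-b/fsid: %s\n", fd_b >= 0 ? "ok" : "fail");

    // Only when both sides lock the *same* fsid file is the second
    // attempt refused.
    int fd_a2 = lock_fsid("/tmp/osd-a");
    std::printf("B locks /tmp/osd-a/fsid: %s\n", fd_a2 >= 0 ? "ok" : "fail");
    _exit(0);
  }
  ::wait(nullptr);
  return 0;
}

The lock only conflicts when both processes open the same fsid inode,
which never happens when each container bind-mounts its own osd dir.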