Re: flock is held after ceph-osd daemon being stopped

Yiming Zhang <yzhan298@xxxxxxxx> · Tue, 18 Feb 2020 09:14:09 -0800

> On Feb 18, 2020, at 4:11 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> 
> On Fri, 2020-02-14 at 07:13 -0800, Yiming Zhang wrote:
>>> On Feb 13, 2020, at 3:52 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>> 
>>> If the OSD daemon dies, then it will have closed all of its fd's and
>>> there should be no more lock. Therefore you almost certainly have some
>>> other process running that is holding the lock.
>>> 
>>> You may have to do a bit of digging in /proc/locks. Determine the
>>> dev+inode number of the file on which the lock is being set and find it
>>> in /proc/locks. Then you can track down the PID that's holding that
>>> lock.
>>> 
>> I have checked the locks with lslocks. Here are the locks after I vstarted Ceph (bluestore block = /dev/sdc, where sdc is a raw device):
>> COMMAND           PID  TYPE SIZE MODE  M START END PATH
>> ceph-mgr        19852 POSIX      WRITE 0     0   0 /...
>> iscsid           1061 POSIX      WRITE 0     0   0 /run...
>> ceph-mgr        14889 POSIX      WRITE 0     0   0 /...
>> rpcbind           990 FLOCK      WRITE 0     0   0 /run...
>> ceph-mon        16430 POSIX      WRITE 0     0   0 /...
>> ceph-mon        16430 POSIX      WRITE 0     0   0 /...
>> ceph-mon        18107 POSIX      WRITE 0     0   0 /...
>> ceph-mon        18107 POSIX      WRITE 0     0   0 /...
>> ceph-mon        19711 POSIX      WRITE 0     0   0 /...
>> ceph-mon        19711 POSIX      WRITE 0     0   0 /...
>> ceph-mon        10495 POSIX      WRITE 0     0   0 /...
>> ceph-mon        10495 POSIX      WRITE 0     0   0 /...
>> ceph-mon        14748 POSIX      WRITE 0     0   0 /...
>> ceph-mon        14748 POSIX      WRITE 0     0   0 /...
>> cron             1085 FLOCK      WRITE 0     0   0 /run...
>> ceph-mgr        18247 POSIX      WRITE 0     0   0 /...
>> atd              1111 POSIX      WRITE 0     0   0 /run...
>> lvmetad           807 POSIX      WRITE 0     0   0 /run...
>> ceph-mgr        10635 POSIX      WRITE 0     0   0 /...
>> ceph-mgr        16571 POSIX      WRITE 0     0   0 /...
>> 
>> Then I killed all related processes and restarted the cluster, but the error “_lock flock failed on /users/xxx/ceph/build/dev/osd0/block” persists.
>> 
>> After the kill, locks are:
>> COMMAND           PID  TYPE SIZE MODE  M START END PATH
>> rpcbind         20267 FLOCK      WRITE 0     0   0 /run...
>> lvmetad         20266 POSIX      WRITE 0     0   0 /run...
>> 
>> The error happens in KernelDevice.cc:
>> int r = ::flock(fd_directs[WRITE_LIFE_NOT_SET], LOCK_EX | LOCK_NB);
>> Here r is -1, fd_directs[WRITE_LIFE_NOT_SET] is 11, and WRITE_LIFE_NOT_SET is 0.
>> 
>> Any suggestions on how to proceed with this issue?
>> 
> 
> Sorry, no. Any lock set on a block device should show up in /proc/locks
> (as it uses the kernel's generic flock lock mechanism for local
> filesystems).
> 
> You may want to play with strace and verify that the error is coming
> from the kernel and that the program is attempting to set the lock on
> the file you think it is.
> 
> What kernel is this running on?

The kernel is 4.15.0-70-generic (I also have the same issue on another kernel, 4.15.18-041518-generic). I used strace to track the issue, and it led to the _lock function in KernelDevice (`r = _lock();` in the KernelDevice::open function). If I comment it out, the error goes away, but that's not a fix.
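
For checking the same call outside of Ceph, here is a minimal standalone sketch. It is not the Ceph code: `try_exclusive_flock` and `FlockResult` are hypothetical names, and it opens the path with plain O_RDWR, which is an assumption and may differ from the flags KernelDevice uses. It also illustrates that flock locks belong to the open file description: a second open() of the same file conflicts with the first even inside one process, and a lock stays held as long as any inherited copy of the descriptor is still open. Whether either of those applies here is only a guess.

```cpp
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Outcome of one non-blocking exclusive flock() attempt.
struct FlockResult {
    int fd;   // descriptor from open(), or -1 if open() failed
    int rc;   // return value of flock(); stays -1 if open() failed
    int err;  // errno captured right after the failing call, 0 on success
};

// Open `path` and issue flock(fd, LOCK_EX | LOCK_NB), the same call that
// fails in the report above. The descriptor is deliberately left open so
// that a successful lock stays held; the caller is expected to close(fd).
// Assumption: plain O_RDWR is used by default here, not Ceph's open flags.
FlockResult try_exclusive_flock(const std::string& path,
                                int open_flags = O_RDWR) {
    FlockResult res{-1, -1, 0};
    res.fd = ::open(path.c_str(), open_flags | O_CLOEXEC);
    if (res.fd < 0) {
        res.err = errno;
        return res;
    }
    res.rc = ::flock(res.fd, LOCK_EX | LOCK_NB);
    if (res.rc < 0) {
        res.err = errno;
    }
    return res;
}
```

Called on the osd0/block path while no OSD is running, rc == -1 with err == EWOULDBLOCK would suggest that some open file description still holds the lock. rc == 0 would suggest the conflict only arises inside ceph-osd itself, for example if the device gets opened and locked twice.
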
Maybe there is a bug here. I’ll keep digging into this.
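
For the /proc/locks digging suggested earlier in the thread, a rough sketch of matching a file against the entries by device and inode could look like the following. All names (`LockEntry`, `parse_lock_line`, `find_locks`, `file_identity`) are hypothetical, and the sample line in the comment is made up to show the layout. It assumes the device and inode numbers from stat() are the ones the kernel prints, which may not hold on every filesystem. For a block device node I would expect the numbers of the node itself (st_dev/st_ino) rather than st_rdev, but I have not verified that.

```cpp
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>
#include <vector>
#include <sys/stat.h>
#include <sys/sysmacros.h>

// One parsed line of /proc/locks.
struct LockEntry {
    bool blocked;          // true for "->" lines (a waiter, not a holder)
    std::string type;      // e.g. "FLOCK" or "POSIX"
    std::string access;    // "READ" or "WRITE"
    long pid;              // PID the kernel recorded for the lock
    unsigned dev_major;    // printed in hex in /proc/locks
    unsigned dev_minor;    // printed in hex in /proc/locks
    unsigned long ino;     // printed in decimal in /proc/locks
};

// Parse a line shaped like this made-up sample:
//   "1: FLOCK  ADVISORY  WRITE 1234 00:06:5678 0 EOF"
// Returns false if the line does not have that shape.
bool parse_lock_line(const std::string& line, LockEntry& out) {
    std::istringstream in(line);
    std::string id, tok, mode, devino;
    LockEntry e{};
    if (!(in >> id >> tok)) return false;
    if (tok == "->") {  // blocked waiter: the type follows the arrow
        e.blocked = true;
        if (!(in >> tok)) return false;
    }
    e.type = tok;
    if (!(in >> mode >> e.access >> e.pid >> devino)) return false;
    if (std::sscanf(devino.c_str(), "%x:%x:%lu",
                    &e.dev_major, &e.dev_minor, &e.ino) != 3) {
        return false;
    }
    out = e;
    return true;
}

// Collect every entry from `locks` (normally a stream on /proc/locks)
// whose device and inode numbers match the ones given.
std::vector<LockEntry> find_locks(std::istream& locks, unsigned dev_major,
                                  unsigned dev_minor, unsigned long ino) {
    std::vector<LockEntry> hits;
    std::string line;
    while (std::getline(locks, line)) {
        LockEntry e;
        if (parse_lock_line(line, e) && e.dev_major == dev_major &&
            e.dev_minor == dev_minor && e.ino == ino) {
            hits.push_back(e);
        }
    }
    return hits;
}

// Fill in the numbers to search for from stat() on `path`.
// Returns false if stat() fails.
bool file_identity(const std::string& path, unsigned& dev_major,
                   unsigned& dev_minor, unsigned long& ino) {
    struct stat st;
    if (::stat(path.c_str(), &st) != 0) return false;
    dev_major = major(st.st_dev);
    dev_minor = minor(st.st_dev);
    ino = static_cast<unsigned long>(st.st_ino);
    return true;
}
```

With a `std::ifstream` opened on /proc/locks and the numbers from `file_identity` for the osd0/block path, any FLOCK hit gives a PID to look up. An empty result while flock() still fails would itself be worth reporting.
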

Thanks,
-ym

> -- 
> Jeff Layton <jlayton@xxxxxxxxxx>
_______________________________________________
Dev mailing list -- dev@xxxxxxx