Re: Dovecot and fcntl locks

Hi Dan,

One of our customers reported practically the same issue with fcntl locks, but without negative PIDs:

0807178332093/mailboxes/Spam/rbox-Mails/dovecot.index.log (WRITE lock held by pid 25164)
0807178336211/mailboxes/INBOX/rbox-Mails/dovecot.index.log (WRITE lock held by pid 8143)

These errors occurred during failure tests in which the underlying MDS servers were shut off. Restarting dovecot was enough to get rid of the errors. The mounted dovecot directories are pinned to specific MDS daemons; the environment is not in production, though. Since we saw these for the first time and the root cause was a disaster scenario, we didn't really take the time to investigate. So I can't share any details, only confirm the issue (for now). Maybe this topic will come up again.

Regards,
Eugen



Quoting Dan van der Ster <dan@xxxxxxxxxxxxxx>:

Hi,

Yeah the negative pid is interesting. AFAICT we use a negative pid to
indicate that the lock was taken on another host:

https://github.com/torvalds/linux/blob/master/fs/ceph/locks.c#L119
https://github.com/torvalds/linux/commit/9d5b86ac13c573795525ecac6ed2db39ab23e2a8

"Finally, we convert remote filesystems to present remote pids using
negative numbers. Have lustre, 9p, ceph, cifs, and dlm negate the remote
pid returned for F_GETLK lock requests."

The good news is that my colleagues managed to clear this filelock by
restarting dovecot on a couple nodes.
But I'm still curious if others have a nice way to debug such things.

Cheers, Dan
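
For anyone who wants to probe such a lock by hand, here is a minimal sketch of an F_GETLK query from Python. It asks the kernel whether a conflicting whole-file write lock exists and interprets a negative pid as "held on another host", following the kernel commit quoted above. The function names query_lock and describe_holder are made up for this example, and the struct flock layout is an assumption that should hold on 64-bit Linux but is not portable. It cannot tell you which host or client holds the lock.

```python
import fcntl
import os
import struct

# ASSUMPTION: struct flock layout on 64-bit Linux:
#   short l_type; short l_whence; off_t l_start; off_t l_len; pid_t l_pid
# (padded to 32 bytes). Other platforms may differ.
FLOCK_FMT = "hhqqi4x"


def query_lock(path, lock_type=fcntl.F_WRLCK):
    """Issue F_GETLK for a whole-file lock of lock_type on path.

    Returns (l_type, l_start, l_len, l_pid). If l_type is F_UNLCK,
    no conflicting lock was found and the other fields are unchanged.
    """
    # l_start=0, l_len=0 means "from the start to the end of the file".
    probe = struct.pack(FLOCK_FMT, lock_type, os.SEEK_SET, 0, 0, 0)
    # Opened read-write so the probe matches what a writer would request;
    # the file content is not modified.
    fd = os.open(path, os.O_RDWR)
    try:
        result = fcntl.fcntl(fd, fcntl.F_GETLK, probe)
    finally:
        os.close(fd)
    l_type, _whence, l_start, l_len, l_pid = struct.unpack(FLOCK_FMT, result)
    return l_type, l_start, l_len, l_pid


def describe_holder(l_type, l_pid):
    """Turn an F_GETLK result into a short human-readable description."""
    if l_type == fcntl.F_UNLCK:
        return "no conflicting lock"
    if l_pid < 0:
        # Remote filesystems negate the pid of a lock taken on another host.
        return f"held on another host (remote pid {-l_pid})"
    return f"held by local pid {l_pid}"


if __name__ == "__main__":
    import sys

    lock_info = query_lock(sys.argv[1])
    print(describe_holder(lock_info[0], lock_info[3]))
```

Run it on a client that has the file system mounted, passing the path of the dovecot.index.log in question. A negative pid only says the holder is remote, so the pid has to be looked up on the other hosts.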


On Mon, Nov 9, 2020 at 8:11 PM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:

Looks like a - in front of the 9605 — signed/unsigned int flern?

> On Nov 9, 2020, at 4:59 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> Hi all,
>
> MDS version v14.2.11
> Client kernel 3.10.0-1127.19.1.el7.x86_64
>
> We are seeing a strange issue with a dovecot use-case on cephfs.
> Occasionally we have dovecot reporting a file locked, such as:
>
> Nov 09 13:55:00 dovecot-backend-00.cern.ch dovecot[27710]:
> imap(reguero)<23945><fRA6B6yznq68uE28>: Error: Mailbox Deleted Items:
> Timeout (180s) while waiting for lock for transaction log file
> /mail/users/r/reguero//mdbox/mailboxes/Deleted
> Items/dbox-Mails/dovecot.index.log (WRITE lock held by pid -9605)
>
> We checked all hosts that have mounted the cephfs -- there is no pid 9605.
>
> Is there any way to see who exactly created the lock? ceph_filelock
> has a client id, but I didn't find a way to inspect the
> cephfs_metadata to see the ceph_filelock directly.
>
> Otherwise, are other Dovecot/CephFS users seeing this? Did you switch
> to flock or lockfile instead of fcntl locks?
>
> Thanks!
>
> Dan
>
> P.S. here is the output from print locks tool from the kernel client:
>
> Read lock:
>  Type: 1 (0: Read, 1: Write, 2: Unlocked)
>  Whence: 0 (0: start, 1: current, 2: end)
>  Offset: 0
>  Len: 1
>  Pid: -9605
> Write lock:
>  Type: 1 (0: Read, 1: Write, 2: Unlocked)
>  Whence: 0 (0: start, 1: current, 2: end)
>  Offset: 0
>  Len: 1
>  Pid: -9605
>
> and same file from a 15.2.5 fuse client :
>
> Read lock:
>  Type: 1 (0: Read, 1: Write, 2: Unlocked)
>  Whence: 0 (0: start, 1: current, 2: end)
>  Offset: 0
>  Len: 0
>  Pid: 0
> Write lock:
>  Type: 1 (0: Read, 1: Write, 2: Unlocked)
>  Whence: 0 (0: start, 1: current, 2: end)
>  Offset: 0
>  Len: 0
>  Pid: 0
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx




