Hello!

There is a problem with blocking async POSIX lock enqueue in 2.6.22 and 2.6.23 kernels. The lock call to the underlying FS is done just fine, but when fl_grant is called to inform lockd of the successful grant, nothing happens and no reply is sent to the client. The end result is that the client reports that the server is not responding.

I enabled dprintks in the code and I see that immediately after fl_grant, the nlmsvc_grant_blocked message (after the callback: label) is printed. Then the "server not responding" messages start, and after every "couldn't create RPC handle for localhost" message I see the nlmsvc_grant_blocked "lockd: GRANTing blocked lock" message again, with no activity from the underlying FS.

I am attaching a reproducer that I have; it is quite simple, actually. Note that the path to the file to lock is hardcoded, so please adjust it for your environment. Locking should be performed on a file that resides on the NFS client mountpoint.
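
For readers without the attachment, a blocking POSIX lock request of the kind described above might look roughly like the following. This is only a sketch and not the attached flock.c: it assumes the lock is taken with fcntl(F_SETLKW) over the whole file, and lock_file_blocking() and unlock_file() are hypothetical helper names. The path is passed in by the caller instead of being hardcoded.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` and take a blocking POSIX write lock
 * on the whole file. Returns the open fd on success, -1 on failure.
 */
int lock_file_blocking(const char *path)
{
    struct flock fl;
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    if (fd < 0) {
        perror("open");
        return -1;
    }

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "to end of file", i.e. whole file */

    /* F_SETLKW waits until the lock is granted; retry if interrupted. */
    while (fcntl(fd, F_SETLKW, &fl) < 0) {
        if (errno == EINTR)
            continue;
        perror("fcntl(F_SETLKW)");
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: drop the lock and close the fd. Returns 0 on success. */
int unlock_file(int fd)
{
    struct flock fl;
    int rc;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;

    rc = fcntl(fd, F_SETLK, &fl);
    close(fd);
    return rc;
}
```

Calling lock_file_blocking() from a small main() on a file under the NFS client mountpoint issues the blocking lock request. On an affected setup, the call would be expected to hang in the way described above, since the grant never reaches the client.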

I reproduced the problem on 2.6.22 and 2.6.23 with Lustre (I am working on adapting Lustre to the async POSIX locks API) and GFS2.

The setup is entirely local: a single node runs GFS2 (both server and client) or Lustre (just the client, but that does not make any difference), the NFS server, and the NFS client that mounts the exported GFS2 or Lustre filesystem.

Bye,
    Oleg
Attachment:
flock.c
Description: Binary data