Question about nlmclnt_lock

Hi,

I was looking at the code for nlmclnt_lock and wanted to ask a
question about how the Linux kernel client and the NLM 4 protocol
handle some errors around certain edge cases.

Specifically, I think there is a race condition around two threads of
the same program acquiring a lock, one of the threads being
interrupted, and the NFS client sending an unlock when none of the
program threads called unlock.

On NFS server machine S:
there exists an unlocked file F

On NFS client machine C:
in program P:
thread 1 tries to lock(F) with fd A
thread 2 tries to lock(F) with fd B

The Linux client will issue two NLM_LOCK calls with the same svid and
the same range, because it derives the svid from the process, not from
the individual thread.
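The process-level ownership can be illustrated locally, without NFS. This is
only a sketch under assumptions: it runs against a temporary file on a local
filesystem, it does not exercise NLM at all, and the helper names
(lock_whole_file, same_process_locks_both_succeed) are made up for this
example. It shows that two descriptors in one process can both "acquire" the
same POSIX lock, which is why the two requests are indistinguishable by owner.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Try to take an exclusive POSIX lock on the whole file, without blocking.
 * Returns 0 on success, -1 on failure (errno set by fcntl). */
int lock_whole_file(int fd)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical demo: open one file twice (fd A and fd B) in the same
 * process and lock it through both descriptors.
 * Returns 1 if both locks succeed, 0 if one fails, -1 on setup error. */
int same_process_locks_both_succeed(void)
{
    char path[] = "/tmp/nlm_demo_XXXXXX";
    int fd_a, fd_b, ok;

    fd_a = mkstemp(path);
    if (fd_a < 0)
        return -1;
    fd_b = open(path, O_RDWR);
    if (fd_b < 0) {
        unlink(path);
        close(fd_a);
        return -1;
    }

    /* Both calls come from the same process, so the second one does not
     * conflict with the first. */
    ok = lock_whole_file(fd_a) == 0 && lock_whole_file(fd_b) == 0;

    unlink(path);
    close(fd_b);
    close(fd_a);
    return ok;
}
```

In the scenario above, thread 1 and thread 2 would each be calling something
like lock_whole_file() on their own descriptor.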

For whatever reason, assume the connection is broken (a cable gets pulled, etc.)
and `status = nlmclnt_call(cred, req, NLMPROC_LOCK);` fails.

The Linux client will retry the request, but at some point thread 1
receives a signal and nlmclnt_lock breaks out of its loop. Because the
Linux client request failed, it will fall through and go to the
out_unlock label, where it will want to send an unlock request.
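The control flow described in the previous paragraph can be paraphrased as a
small user-space model. This is not the kernel source: the function name, the
enum, and the parameters are all invented for illustration, and the only thing
it encodes is "retry on failure, leave the loop on a signal, then queue an
unlock".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_outcome {
    TOY_GRANTED,                 /* a LOCK call succeeded */
    TOY_UNLOCK_QUEUED,           /* interrupted; the out_unlock path runs */
    TOY_STILL_RETRYING           /* ran out of modelled attempts */
};

/* call_ok[i] says whether the i-th modelled LOCK call succeeds.
 * signal_at is the attempt index at which a signal arrives (use a value
 * >= attempts for "no signal"). */
enum toy_outcome toy_lock_loop(const bool *call_ok, size_t attempts,
                               size_t signal_at)
{
    assert(call_ok != NULL);

    for (size_t i = 0; i < attempts; i++) {
        if (i == signal_at)
            return TOY_UNLOCK_QUEUED;   /* signal: break out, send unlock */
        if (call_ok[i])
            return TOY_GRANTED;
        /* call failed (e.g. connection down): go around and retry */
    }
    return TOY_STILL_RETRYING;
}
```

With every call failing and a signal arriving before the second attempt, the
model ends in TOY_UNLOCK_QUEUED, which corresponds to thread 1 in the scenario.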

Assume that at some point the connection is reestablished.

The Linux kernel client now has two outstanding requests to send to
the remote server: a lock request for thread 2, which is still trying
to acquire the lock, and an unlock request for thread 1, whose lock
attempt failed and was interrupted.

I'm worried that the Linux client may first send the lock request, and
tell thread 2 that it acquired the lock, and then send an unlock
request from the cancelled thread 1 request.

The server will successfully process both requests, because the svid
is the same for both, and the true server side state will be that the
file is unlocked.
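The suspected ordering can be replayed against a toy model of the server's
state. This is a deliberate simplification and an assumption, not how lockd or
any real server is implemented: one file, one owner identified only by svid,
and invented function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy server-side view of a single file: at most one owner, known by svid. */
struct toy_file_state {
    bool locked;
    int owner_svid;
};

/* Modelled NLM_LOCK: granted if the file is free or held by the same svid. */
bool toy_lock(struct toy_file_state *f, int svid)
{
    assert(f != NULL);
    if (f->locked && f->owner_svid != svid)
        return false;
    f->locked = true;
    f->owner_svid = svid;
    return true;
}

/* Modelled NLM_UNLOCK: drops the lock only if the svid matches the owner. */
void toy_unlock(struct toy_file_state *f, int svid)
{
    assert(f != NULL);
    if (f->locked && f->owner_svid == svid)
        f->locked = false;
}

/* Replay: thread 2's LOCK arrives first, then thread 1's stale UNLOCK.
 * Returns whether the server still considers the file locked. */
bool locked_after_stale_unlock(int lock_svid, int unlock_svid)
{
    struct toy_file_state f = { false, 0 };

    toy_lock(&f, lock_svid);
    toy_unlock(&f, unlock_svid);
    return f.locked;
}
```

With the same svid on both requests the model ends up unlocked even though
thread 2 was told it holds the lock; with distinct svids the stale unlock has
no effect.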

One can talk about the wisdom of using multiple threads to acquire the
same file lock, but this behavior is weird, because none of the
threads called unlock.

I have experimented with reproducing this, but have not been
successful in triggering this ordering of events.

I've also looked at the code in clntproc.c, and I don't see a spot
where outstanding failed lock/unlock requests are checked while
processing lock requests. Am I missing something?

Thanks,
-Jan


