On Fri, 2008-03-28 at 17:37 -0400, Peter Staubach wrote:
> However, I think that nlmclnt_unlock() needs to wait until
> the RPC is completed.

It should do that now. See the call to rpc_wait_for_completion_task() in
nlm_async_call().

> The original problem was test12() in
> the Connectathon testsuite, which would occasionally fail.
> It would fail because the parent would kill the child process
> (actually the child of the child) and immediately attempt to
> grab the lock. This would fail because the child hadn't
> completed releasing the lock yet. There were some timing
> dependencies in test12() itself, which I eliminated, but then
> discovered that this wouldn't solve the entire problem. (I
> can send you the new version of test12(), if you wish.)

So, at least in 2.6.25, the call to rpc_wait_for_completion_task() will
exit only on a fatal signal.

The problem in test12() is that there is a 'pre-existing condition', in
that the parent signalled us with a SIGINT, and so the signal is set upon
entry to the function.

IOW: we might have to perform a similar trick to what do_coredump()
does, and clear the TIF_SIGPENDING flag. I'm not sure if that is
sufficient, but given that we're eliminating the calls to
recalc_sigpending(), and that there should be no such calls left in the
RPC code, I think we're OK.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
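
For illustration only, the 'pre-existing condition' in test12() can be mimicked in userspace. The sketch below is not kernel or lockd code: the function name wait_interrupted_by_preexisting_signal() and its discard_pending parameter are made up for the example, pselect() merely stands in for an interruptible wait, and sigwait() (which consumes the signal) is only a rough analogue of clearing TIF_SIGPENDING (which does not). It shows that a SIGINT already pending on entry cuts the wait short at once, and that dealing with it first lets the wait run to completion.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

/* Empty handler: its only purpose is to make the wait return EINTR
 * instead of letting SIGINT kill the process. */
static void on_sigint(int signo)
{
    (void)signo;
}

/*
 * Hypothetical helper (not a kernel API).
 * Makes SIGINT pending *before* an interruptible wait begins, then waits.
 * If discard_pending is non-zero, the pending signal is consumed first.
 * Returns 1 if the wait was interrupted, 0 if it ran until its timeout.
 */
int wait_interrupted_by_preexisting_signal(int discard_pending)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    struct timespec timeout = { 0, 100 * 1000 * 1000 }; /* 100 ms */
    int rc, saved_errno, signo;

    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART: let the wait see EINTR */
    rc = sigaction(SIGINT, &sa, &old_sa);
    assert(rc == 0);

    /* Block SIGINT so that raising it leaves it pending. */
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    rc = sigprocmask(SIG_BLOCK, &block, &old_mask);
    assert(rc == 0);

    /* The "parent signalled us" step: the signal is set on entry. */
    raise(SIGINT);

    if (discard_pending) {
        /* Userspace stand-in for clearing the pending state. */
        rc = sigwait(&block, &signo);
        assert(rc == 0 && signo == SIGINT);
    }

    /* Interruptible wait with SIGINT unblocked for its duration. */
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGINT);
    rc = pselect(0, NULL, NULL, NULL, &timeout, &wait_mask);
    saved_errno = errno;

    /* Restore the caller's signal mask and SIGINT disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGINT, &old_sa, NULL);

    return rc == -1 && saved_errno == EINTR;
}
```

Calling the helper with discard_pending set to 0 reports an interrupted wait; calling it with 1 lets the wait run its full 100 ms. Whether clearing the flag alone is sufficient in the kernel is exactly the open question above, so this is only a model of the symptom.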