At 10:00 AM 5/20/2009, Tom Talpey wrote:
>At 02:55 AM 5/20/2009, Rob Gardner wrote:
>>Tom Talpey wrote:
>>> At 04:43 PM 5/19/2009, Rob Gardner wrote:
>>> >I've got a question about lockd in conjunction with a filesystem
>>> >that provides its own (async) locking.
>>> >
>>> >After nlmsvc_lock() calls vfs_lock_file(), it seems to me that we
>>> >might get the async callback (nlmsvc_grant_deferred) at any time.
>>> >What's to stop it from arriving before we even put the block on the
>>> >nlm_block list? If this happens, then nlmsvc_grant_deferred() will
>>> >print "grant for unknown block" and then we'll wait forever for a
>>> >grant that will never come.
>>>
>>> Yes, there's a race, but the client will retry every 30 seconds, so
>>> it won't wait forever.
>>
>>OK, a blocking lock request will get retried in 30 seconds and work
>>out "ok". But a non-blocking request will get in big trouble. Let's
>>say the
>
>A non-blocking lock doesn't request, and won't get, a callback. So I
>don't understand...
>
>>callback is invoked immediately after the vfs_lock_file call returns
>>FILE_LOCK_DEFERRED. At this point, the block is not on the nlm_block
>>list, so the callback routine will not be able to find it and mark it
>>as granted. Then nlmsvc_lock() will call nlmsvc_defer_lock_rqst(), put
>>the block on the nlm_block list, and eventually the request will time
>>out and the client will get lck_denied. Meanwhile, the lock has
>>actually been granted, but nobody knows about it.
>
>Yes, this can happen; I've seen it too. Again, it's a bug in the
>protocol more than a bug in the clients. It gets even worse when
>retries occur. If the reply cache doesn't catch the duplicates (and it
>never does), all heck breaks loose.
>
>>
>>> Depending on the kernel client version, there are some improvements
>>> we've tried over time to close the raciness a little. What exact
>>> client version are you working with?
>>>
>>
>>I maintain nfs/nlm server code for a NAS product, and so there is no
>>"exact client" but rather a multitude of clients that I have no
>>control over. All I can do is hack the server. We have been working
>>around this
>
>I feel for ya (been there, done that) :-)
>
>>by using a semaphore to cover the vfs_lock_file() to
>>nlmsvc_insert_block() sequence in nlmsvc_lock() and also
>>nlmsvc_grant_deferred(). So if the callback arrives at a bad time, it
>>has to wait until the lock actually makes it onto the nlm_block list,
>>and so the status of the lock gets updated properly.
>
>Can you explain this further? If you're implementing the server, how
>do you know your callback "arrives at a bad time", by the DENIED
>result from the client?
>
>Another thing to worry about is the presence of NLM_CANCEL calls from
>the client which cross the callbacks.
>
>I sent a patch which improves the situation at the client some time
>ago. Basically it was more willing to positively acknowledge a
>callback which didn't match the nlm_blocked list, by also checking
>whether the lock was actually being held. This was only half the
>solution, however; it didn't close the protocol race, just the client
>one. You want the patch? I'll look for it.

Found it, on the old nfs list:

http://thread.gmane.org/gmane.linux.nfs/16611

Tom.

>
>>
>>> Use NFSv4? ;-)
>>>
>>
>>I had a feeling you were going to say that. ;-) Unfortunately that
>>doesn't make NFSv3 and lockd go away.
>
>Yes, I know. Unfortunately there aren't any elegant solutions to the
>NLM protocol's flaws.
>
>Tom.