Re: [PATCH] lockd: handle fl_grant callbacks with coalesced locks (RFC)

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Mon, 24 Nov 2008 12:06:53 -0500

On Mon, Nov 24, 2008 at 10:33:13AM -0500, Jeff Layton wrote:
> On Fri, 21 Nov 2008 20:15:55 -0500
> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> 
> > On Wed, Nov 19, 2008 at 04:37:03PM -0500, Jeff Layton wrote:
> > > Dave Teigland opened a bug stating that he was having some problems with
> > > DLM and lockd. Essentially, the problem boiled down to the fact that
> > > when you do a posix lock of a region that touches another lock, the VFS
> > > will coalesce the locks and return a lock that encompasses the whole
> > > range.
> > > 
> > > The problem here is that DLM then does a fl_grant callback to lockd with
> > > the new coalesced lock. The fl_grant callback then looks through all
> > > of the blocks and eventually returns -ENOENT since none match the
> > > coalesced lock.
> > 
> > Ugh.
> > 
> 
> My sentiments exactly...
> 
> > > I'm having a very hard time tracking down info about how the fl_grant
> > > callback is supposed to work. Is it OK to send an fl_grant callback
> > > with a lock that's larger than the one requested? If so, then lockd
> > > needs to account for that possibility. Also, what exactly is the
> > > purpose of the second arg on fl_grant ("conf" in nlmsvc_grant_deferred)?
> > 
> > It's only used in the case the lock failed, and it can optionally be set
> > to a copy of the conflicting lock.
> > 
> 
> Ok, good to know. At some point, a file in Documentation on these interfaces
> this might be a nice addition...

OK.  I've made a note to do that.  (Which doesn't mean it will happen
soon.)

> 
> > > What follows is a patch that changes nlmsvc_grant_deferred to account
> > > for the possibility that it may receive an fl_grant that has already
> > > been coalesced. It changes nlmsvc_grant_deferred to walk the entire
> > > list of blocks and grant any blocks that are covered by the range of
> > > the lock in the grant callback, if doing so will not conflict with an
> > > earlier grant.
> > 
> > Hm.  That might work.
> > 
> 
> It seems to with very basic, cursory testing, but it could be
> broken and I'm just not seeing it. This code looks like it's only used
> in a NLM over DLM setup though, so it's hard to comprehensively test
> it. This scheme is certainly more complex than the current code, though
> and I'm not sure I have everything right with it.

I still haven't looked at the code yet.  Probably the first thing I'd
check would whether the previous code depends on an assumption that each
grant request results in revisit of exactly one rpc.  I can't see a
problem.

> > > The patch is still very rough and is probably broken in subtle (and
> > > maybe overt) ways, but it fixes the reproducer that Dave provided. It's
> > > just intended as a starting point for discussion. Can anyone clarify how
> > > fl_grant is supposed to work? Who's wrong here? DLM or NLM?
> > 
> > I think this wasn't thought through, apologies.  (It was my
> > responsibility to make sure it was!)
> > 
> > I also occasionally think that it would be better to keep any actual
> > lock application the caller's responsibility, to the extent that's
> > possible--so fl_grant would just be a notification, and it would be up
> > to lockd to try the lock again and check the result.  That's more or
> > less the way it always worked before, and it seems a simpler model.
> > 
> > Some work in the filesystem would be required to make sure it would be
> > ready to return an answer on that second request.
> > 
> > I also think there's a problem with lock cancelling in the current code.
> > 
> 
> No worries, I think we can come up with something workable now that we
> understand the problem.
> 
> When I dug through the mailing list archives, the only thing I could
> find on this was this post by you:
> 
> http://lkml.org/lkml/2008/4/15/246
> 
> ...and in particular:
> 
> 	- With fl_grant the filesystem says: I'm giving you the final
> 	  result of the lock operation.  In particular, if I'm telling
> 	  you it succeeded, that means I've already granted you the
> 	  lock; don't ask me about it again.
> 
> ...that seems to contradict what you're suggesting above. I suppose we
> could consider changing NLM to use the .fl_notify interface, which you
> described as:
> 
> 	- With fl_notify the filesystem says: something just happened to
> 	  this lock.  I'm not guaranteeing anything, I'm just telling
> 	  you this would be a good time to try the lock again.  Do it
> 	  and maybe you'll get lucky!
> 
> ...is that what you were suggesting?

Yes.  But it would be a fundamental change.  We should try to salvage
the current approach first.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html