Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 03 Aug 2014 22:15:05 +0800
Kinglong Mee <kinglongmee@xxxxxxxxx> wrote:

> On 8/3/2014 19:35, Jeffrey Layton wrote:
> > On Sat, Aug 2, 2014 at 9:59 PM, Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx <mailto:trond.myklebust@xxxxxxxxxxxxxxx>> wrote:
> > 
> >     On Sat, Aug 2, 2014 at 6:59 PM, Jeff Layton <jeff.layton@xxxxxxxxxxxxxxx <mailto:jeff.layton@xxxxxxxxxxxxxxx>> wrote:
> >     > On Sat, 2 Aug 2014 10:47:27 -0400
> >     > Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx <mailto:trond.myklebust@xxxxxxxxxxxxxxx>> wrote:
> >     >
> >     >> On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <kinglongmee@xxxxxxxxx <mailto:kinglongmee@xxxxxxxxx>> wrote:
> >     >> > On 8/2/2014 22:05, Trond Myklebust wrote:
> >     >> >> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <kinglongmee@xxxxxxxxx <mailto:kinglongmee@xxxxxxxxx>> wrote:
> >     >> >>> On 8/2/2014 21:11, Trond Myklebust wrote:
> >     >> >>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <kinglongmee@xxxxxxxxx <mailto:kinglongmee@xxxxxxxxx>> wrote:
> >     >> >>>>> On 7/30/2014 09:34, Jeff Layton wrote:
> >     >> >>>>>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx <mailto:trond.myklebust@xxxxxxxxxxxxxxx>>
> >     >> >>>>>>
> >     >> >>>>>> Preparation for removing the client_mutex.
> >     >> >>>>>>
> >     >> >>>>>> Convert the open owner hash table into a per-client table and protect it
> >     >> >>>>>> using the nfs4_client->cl_lock spin lock.
> >     >> >>>>>>
> >     >> >>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx <mailto:trond.myklebust@xxxxxxxxxxxxxxx>>
> >     >> >>>>>> ---
> >     >> >>>>>>  fs/nfsd/netns.h     |   1 -
> >     >> >>>>>>  fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> >     >> >>>>>>  fs/nfsd/state.h     |   1 +
> >     >> >>>>>>  3 files changed, 86 insertions(+), 103 deletions(-)
> >     >> >>>>>>
> >     >> >>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> >     >> >>>>>> index a71d14413d39..e1f479c162b5 100644
> >     >> >>>>>> --- a/fs/nfsd/netns.h
> >     >> >>>>>> +++ b/fs/nfsd/netns.h
> >     >> >>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
> >     >> >>>>>>       struct rb_root conf_name_tree;
> >     >> >>>>>>       struct list_head *unconf_id_hashtbl;
> >     >> >>>>>>       struct rb_root unconf_name_tree;
> >     >> >>>>>> -     struct list_head *ownerstr_hashtbl;
> >     >> >>>>>
> >     >> >>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> >     >> >>>>> http://comments.gmane.org/gmane.linux.nfs/64382
> >     >> >>>>>
> >     >> >>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
> >     >> >>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
> >     >> >>>>
> >     >> >>>> Why? We're not currently doing that.
> >     >> >>>
> >     >> >>> Although not doing that right now, but there is a bug for getting the right ld_owner
> >     >> >>> in nfs4_set_lock_denied.
> >     >> >>>
> >     >> >>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
> >     >> >>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
> >     >> >>>
> >     >> >>> If we want fix this problem, needs finding the lockowner from struct file_lock.
> >     >> >>
> >     >> >> Do we really care enough about fixing nfs4_set_lock_denied enough to
> >     >> >> do so at the cost of reducing overall scalability of locking state?
> >     >> >
> >     >> > I just report this problem, don't think enough about the scalability.
> >     >> >
> >     >> >> We will always be faking up the clientid etc for local locks. Are
> >     >> >> there any clients out there that actually inspect the clientid on a
> >     >> >> result of NFS4ERR_DENIED and that will break if we give them a fake
> >     >> >> for non-local locks?
> >     >> >
> >     >> > Jeff has point the same problem of a non-nfs4_lockowner.
> >     >> > Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
> >     >> > the lockowner stored in struct file_lock by checking fl_lmops.
> >     >> >
> >     >>
> >     >> Alternatively, set a flag in fl_flags. Back in the days, we used to
> >     >> have a FL_NFSD, perhaps it is time to resurrect that?
> >     >>
> >     >
> >     > Would we need a similar flag for lockd too?
> >     >
> >     > I'm not sure a flag is the correct approach for this. A "fl_lmtype" field
> >     > or something similar might make sense, but I'm not sure that really
> >     > adds much over just ensuring that fl_lmops is set properly for these locks.
> > 
> >     It avoids adding an extra copy of fl_lmops that would only be useful
> >     to knfsd as a flag.
> > 
> >     > That said, we definitely will need to ensure that there are no harmful
> >     > effects from setting fl_lmops on a conflock container. I don't see any
> >     > right offhand, but it's probably reasonable to put something like that
> >     > in -next for a bit of soak time...
> > 
> >     The real problem here is that you are copying the lock, and then
> >     trying to dereference the copied pointer without ensuring that the
> >     thing the pointer is referencing still exists. Given that we're
> >     removing the global state mutex, then it is now perfectly possible for
> >     a competing knfsd thread to free the conflicting lock and the
> >     lockowner pointed to by fl->fl_owner before your thread hits
> >     nfs4_set_lock_denied.
> >     The check for fl_lmtype does nothing to fix that race (nor does FL_NFSD).
> > 
> >     Cheers
> >       Trond
> > 
> > 
> > Agreed. That's a significant bit of nastiness, and I think you're correct that the
> > client_mutex removal will exacerbate the problem. The only way I can see to
> > properly fix that would be to make it so that the conflock holds a reference to the
> > lockowner, and then put it when the conflock is freed.
> > 
> 
> I will check the reference to lockowner in nfsd, and try to simulate client-id expire.
> 

You shouldn't need to do anything special there. It should be
sufficient to simply take a reference to the lockowner when you copy
the data to the conflock, and then put that reference when you go to
free it.

> > That will require some new fl_ops or fl_lmops. It would also be good to clean up
> > conflock generation a bit, and turn it into a more explicit process.
> > __locks_copy_lock doesn't really fully describe what's going on there, IMO...
> 
> I will have a try.
> 
> thanks,
> Kinglong Mee

Thanks. For now, I'm going to drop the patch that you proposed
yesterday since Trond's quite correct that there's more work needed in
this area before that's really safe.

-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux