Re: [PATCH 5/7] NFS: Convert lookups of the open context to RCU

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2018-10-04 at 11:49 -0400, Olga Kornievskaia wrote:
> On Thu, Oct 4, 2018 at 11:22 AM Trond Myklebust <
> trondmy@xxxxxxxxxxxxxxx> wrote:
> > 
> > Hi Olga,
> > 
> > On Wed, 2018-10-03 at 14:38 -0400, Olga Kornievskaia wrote:
> > > Hi Trond,
> > > 
> > > Here's why the ordering of the "open_files" list matters and
> > > changes/fixes the existing problem.
> > > 
> > > When we first open the file for writing and get a delegation,
> > > it's
> > > the
> > > first one on the list. When we opened the file again for the same
> > > mode
> > > type, then before the patch, the new entry is inserted before
> > > what's
> > > already on the list. Both of these files share the same
> > > nfs4_state
> > > that's marked delegated.
> > > 
> > > Once we receive a delegation recall, in delegation_claim_opens()
> > > we
> > > walk the list. First one will be the 2nd open. It's marked
> > > delegated
> > > but after calling nfs4_open_delegation_recall() the delegation
> > > flag
> > > is
> > > cleared. The 2nd open doesn't have the lock associated with it.
> > > So no
> > > lock is reclaimed. We now go to the 2nd entry in the open_file
> > > list
> > > which is the 1st open but now the delegation flag is cleared so
> > > we
> > > never recover the lock.
> > > 
> > > Any of the opens on the open_list can be the lock holder and we
> > > can't
> > > clear the delegation flag on the first treatment of the delegated
> > > open
> > > because it might not be the owner of the lock.
> > > 
> > > I'm trying to figure out how I would fix it but I thought I'd
> > > send
> > > this for your comments.
> > 
> > The expectation here is that once we find an open context with a
> > stateid that needs to be reclaimed or recovered, we recover _all_
> > the
> > state associated with that stateid.
> > IOW: the expectation is that we first look at the open state, and
> > (depending on whether this is a write delegation or a read
> > delegation)
> > run through a set of calls to nfs4_open_recover_helper() that will
> > recover all outstanding open state for this file.
> 
> That's true. I see that it will recover all the outstanding opens for
> this file.
> 
> > We then iterate through all the lock stateids for the file and
> > recover
> > those.
> 
> However this is not true. Because we pass in a specific
> nfs_open_context into the nfs_delegation_claim_locks() and while
> looping thru the list of locks for the file we compare if the open
> context associated with the file lock is same as the passed in
> context. The passed in context is that of the first nfs_open_context
> that was marked delegated and not necessarily the context that hold
> the locks. That's the problem.
> 
> While we are looping thru the locks for the file, we need to be
> checking against any and all the nfs_open_context that are associated
> with the file and recovering those locks. I'm still not sure how to
> do
> it.
> 

Interesting. Why are we checking that open context? I can't see any
reason why we would want to do that.

> > 
> > So why are we holding the open context, and not just pinning the
> > stateid while we perform the recovery? The main reason is to ensure
> > that we also pin the path. The stateid holds a reference to the
> > inode,
> > but recovery can require us to perform an OPEN based on path (e.g.
> > when
> > using NFS4_OPEN_CLAIM_DELEGATE_CUR). Hence the utility of the open
> > context, which carries a reference to a struct dentry.
> > 
> > > 
> > > On Fri, Sep 28, 2018 at 4:38 PM Trond Myklebust <
> > > trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > 
> > > > On Fri, 2018-09-28 at 16:19 -0400, Olga Kornievskaia wrote:
> > > > > On Fri, Sep 28, 2018 at 4:07 PM Trond Myklebust <
> > > > > trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > > > 
> > > > > > On Fri, 2018-09-28 at 15:55 -0400, Olga Kornievskaia wrote:
> > > > > > > On Fri, Sep 28, 2018 at 3:10 PM Olga Kornievskaia <
> > > > > > > aglo@xxxxxxxxx
> > > > > > > > 
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > 
> > > > > > > Wait, why are we suppose to reclaim the open state when
> > > > > > > we
> > > > > > > have a
> > > > > > > valid open stateid? We don't have any cached opens that
> > > > > > > server
> > > > > > > doesn't
> > > > > > > know about. RFC7530 says "if the file has other open
> > > > > > > reference",
> > > > > > > I
> > > > > > > think the emphasis is on "other". I don't believe we need
> > > > > > > to
> > > > > > > be
> > > > > > > sending anything besides the locks to the server. Then
> > > > > > > I'm
> > > > > > > back
> > > > > > > to
> > > > > > > square one.
> > > > > > 
> > > > > > Holding a delegation does not imply that we hold an open
> > > > > > stateid.
> > > > > > Under
> > > > > > Linux, the open stateid gets closed as soon as the
> > > > > > application
> > > > > > closes
> > > > > > the file.
> > > > > > 
> > > > > > The delegation, on the other hand, is retained until either
> > > > > > it
> > > > > > is
> > > > > > recalled, or we see that the file has not been used for 2
> > > > > > lease
> > > > > > periods.
> > > > > 
> > > > > Ok I agree with all of it but I'm saying it doesn't need to
> > > > > be
> > > > > reclaimed unconditionally or are you saying that's what the
> > > > > linux
> > > > > client does? In this test case, the file hasn't been closed
> > > > > or
> > > > > expired. I'm stating that the client has a valid open stateid
> > > > > and
> > > > > should only be required to reclaim the locks (which with this
> > > > > patch
> > > > > it
> > > > > does).
> > > > 
> > > > As I said earlier, the client is required to recover all
> > > > _cached_
> > > > open
> > > > and lock state. If it already holds an open stateid, then it
> > > > should
> > > > not
> > > > need to reclaim the open modes that are covered by that
> > > > stateid,
> > > > however it may still need to reclaim those open modes that were
> > > > not
> > > > already subject to an explicit OPEN call.
> > > > 
> > > > IOW: If the file was first opened with an open(O_RDRW) call by
> > > > the
> > > > application, but a second application then opened it using
> > > > open(O_WRONLY), then we may already hold a stateid with a
> > > > "SHARE_ACCESS_BOTH" open mode, however we will still need to
> > > > send a
> > > > reclaim for the cached SHARE_ACCESS_WRITE mode, so that a later
> > > > OPEN_DOWNGRADE(SHARE_ACCESS_WRITE) can succeed.
> > > > 
> > > > --
> > > > Trond Myklebust
> > > > Linux NFS client maintainer, Hammerspace
> > > > trond.myklebust@xxxxxxxxxxxxxxx
> > > > 
> > > > 
> > 
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@xxxxxxxxxxxxxxx
> > 
> > 
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux