Re: still seeing single client NFS4ERR_DELAY / CB_RECALL

On Mon, Aug 24, 2020 at 09:39:31AM -0400, Chuck Lever wrote:
> 
> 
> > On Aug 19, 2020, at 5:29 PM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > 
> > On Tue, Aug 18, 2020 at 05:26:26PM -0400, Chuck Lever wrote:
> >> 
> >>> On Aug 17, 2020, at 6:20 PM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >>> 
> >>> On Sun, Aug 16, 2020 at 04:46:00PM -0400, Chuck Lever wrote:
> >>> 
> >>>> In order of application:
> >>>> 
> >>>> 5920afa3c85f ("nfsd: hook nfsd_commit up to the nfsd_file cache")
> >>>> 961.68user 5252.40system 20:12.30elapsed 512%CPU, 2541 DELAY errors
> >>>> These results are similar to v5.3.
> >>>> 
> >>>> fd4f83fd7dfb ("nfsd: convert nfs4_file->fi_fds array to use nfsd_files")
> >>>> Does not build
> >>>> 
> >>>> eb82dd393744 ("nfsd: convert fi_deleg_file and ls_file fields to nfsd_file")
> >>>> 966.92user 5425.47system 33:52.79elapsed 314%CPU, 1330 DELAY errors
> >>>> 
> >>>> Can you take a look and see if there's anything obvious?
> >>> 
> >>> Unfortunately nothing about the file cache code is very obvious to me.
> >>> I'm looking at it....
> >>> 
> >>> It adds some new nfserr_jukebox returns in nfsd_file_acquire.  Those
> >>> mostly look like kmalloc failures, the one I'm not sure about is the
> >>> NFSD_FILE_HASHED check.
> >>> 
> >>> Or maybe it's the lease break there.
> >> 
> >> nfsd_file_acquire() always calls fh_verify() before it invokes nfsd_open().
> >> Replacing nfs4_get_vfs_file's nfsd_open() call with nfsd_file_acquire() adds
> >> almost 10 million fh_verify() calls to my test run.
> > 
> > Checking out the code as of fd4f83fd7dfb....
> > 
> > nfsd_file_acquire() calls nfsd_open_verified().
> > 
> > And nfsd_open() is basically just fh_verify()+nfsd_open_verified().
> > 
> > So it doesn't look like the replacement of nfsd_open() by
> > nfsd_file_acquire() should have changed the number of fh_verify() calls.
> 
> I see a lot more vfs_setlease() failures after fd4f83fd7dfb.
> check_conflicting_open() fails because "inode is open for write":
> 
> 1780         if (arg == F_RDLCK)
> 1781                 return inode_is_open_for_write(inode) ? -EAGAIN : 0;
> 
> The behavior on the wire is that the server simply doesn't hand out
> many delegations.
> 
> NFSv4 OPEN uses nfsd_file_acquire() now, but I don't see CLOSE
> releasing the cached file descriptor. Wouldn't that cached
> descriptor conflict with subsequent OPENs?

Could be, yes.

That also reminds me of this patch, did I already send it to you?

--b.

commit 055e7b94ef32
Author: J. Bruce Fields <bfields@xxxxxxxxxx>
Date:   Fri Jul 17 18:54:54 2020 -0400

    nfsd: Cache R, RW, and W opens separately
    
    The nfsd open code has always kept separate read-only, read-write, and
    write-only opens as necessary to ensure that when a client closes or
    downgrades, we don't retain more access than necessary.
    
    Honestly, I'm not sure if that's completely necessary, but I'd rather
    stick to that behavior.
    
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 82198d747c4c..4b6f70e0d987 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -891,7 +891,7 @@ nfsd_file_find_locked(struct inode *inode, unsigned int may_flags,
 
 	hlist_for_each_entry_rcu(nf, &nfsd_file_hashtbl[hashval].nfb_head,
 				 nf_node, lockdep_is_held(&nfsd_file_hashtbl[hashval].nfb_lock)) {
-		if ((need & nf->nf_may) != need)
+		if (nf->nf_may != need)
 			continue;
 		if (nf->nf_inode != inode)
 			continue;


