Re: [PATCH/RFC] NFSv3 handle out-of-order write replies

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2023-01-27 at 07:36 +1100, NeilBrown wrote:
> On Thu, 26 Jan 2023, Jeff Layton wrote:
> > On Wed, 2023-01-25 at 13:03 +1100, NeilBrown wrote:
> > 
> > > diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> > > index 7931fa472561..8a83d6d204ed 100644
> > > --- a/include/linux/nfs_fs.h
> > > +++ b/include/linux/nfs_fs.h
> > > @@ -190,6 +190,33 @@ struct nfs_inode {
> > >  	/* Open contexts for shared mmap writes */
> > >  	struct list_head	open_files;
> > >  
> > > +	/* Keep track of out-of-order replies.
> > > +	 * The ooo array contains start/end pairs of
> > > +	 * number from the changeid sequence when
> > > +	 * the inodes iversion has been updated.
> > > +	 * It also contains end/start pair (i.e. reverse order)
> > > +	 * of sections of the changeid sequence that have
> > > +	 * been seen in replies from the server.
> > > +	 * Normally these should match and when both
> > > +	 * A:B and B:A are found in ooo, they are both removed.
> > > +	 * And if a reply with A:B causes an iversion update
> > > +	 * of A:B, then neither are added.
> > > +	 * When a reply has pre_change that doesn't match
> > > +	 * iversion, then the changeid pair, and any consequent
> > > +	 * change in iversion ARE added.  Later replies
> > > +	 * might fill in the gaps, or possibly a gap is caused
> > > +	 * by a change from another client.
> > > +	 * When a file or directory is opened, if the ooo table
> > > +	 * is not empty, then we assume the gaps were due to
> > > +	 * another client and we invalidate the cached data.
> > > +	 *
> > > +	 * We can only track a limited number of concurrent gaps.
> > > +	 * Currently that limit is 16.
> > > +	 */
> > > +	int ooo_cnt;
> > > +	int ooo_max; // TRACING
> > > +	unsigned long ooo[16][2];
> > 
> > Why unsigned longs here? Shouldn't these be u64?
> 
> Yes, they should be u64.  Thanks.
> 
> > 
> > I guess you could argue that when we have 32-bit longs that the most
> > significant bits don't matter, but that's also the case with 64-bit
> > longs, and 32 bits would halve the space requirements.
> > 
> > Also, this grows each inode by 2k on a 64-bit arch! Maybe we should
> > dynamically allocate these things instead? If the allocation fails, then
> > we could just go back to marking the cache invalid and move on.
> 
> 2K? 8*16*2+4*2 == 264, not 2048.
> 

You're correct of course. I'm not sure where I got 2048.

> But I agree that allocating on demand would make sense.  16 is probably
> more than needed.  I don't have proper testing results from the customer
> yet, but I wouldn't be surprised if a smaller number would suffice.  But
> if we are allocating only on demand, it wouldn't hurt to allocate 16.

Makes sense. Is there a max number of NFS writes you can have in flight
to a single server in the client? That might inform how large an array
you need.
> 
-- 
Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux