Re: [PATCH v5 3/5] nfsd: rework refcounting in filecache

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 02 Nov 2022, Jeff Layton wrote:
> On Wed, 2022-11-02 at 08:23 +1100, NeilBrown wrote:
> > On Wed, 02 Nov 2022, Jeff Layton wrote:
> > > The filecache refcounting is a bit non-standard for something searchable
> > > by RCU, in that we maintain a sentinel reference while it's hashed. This
> > > in turn requires that we have to do things differently in the "put"
> > > depending on whether its hashed, which we believe to have led to races.
> > > 
> > > There are other problems in here too. nfsd_file_close_inode_sync can end
> > > up freeing an nfsd_file while there are still outstanding references to
> > > it, and there are a number of subtle ToC/ToU races.
> > > 
> > > Rework the code so that the refcount is what drives the lifecycle. When
> > > the refcount goes to zero, then unhash and rcu free the object.
> > > 
> > > With this change, the LRU carries a reference. Take special care to
> > > deal with it when removing an entry from the list.
> > 
> > The refcounting and lru management all look sane here.
> > 
> > You need to have moved the final put (and corresponding fsync) to
> > different threads.  I think I see you and Chuck discussing that and I
> > have no sense of what is "right". 
> > 
> 
> Yeah, this is a tough call. I get Chuck's reticence.
> 
> One thing we could consider is offloading the SYNC_NONE writeback
> submission to a workqueue. I'm not sure though whether that's a win --
> it might just add needless context switches. OTOH, that would make it
> fairly simple to kick off writeback when the REFERENCED flag is cleared,
> which would probably be the best time to do it.
> 
> An entry that ends up being harvested by the LRU scanner is going to be
> touched by it at least twice: once to clear the REFERENCED flag, and
> again ~2s later to reap it.
> 
> If we schedule writeback when we clear the flag then we have a pretty
> good indication that nothing else is going to be using it (though I
> think we need to clear REFERENCED even when nfsd_file_check_writeback
> returns true -- I'll fix that in the coming series).
> 
> In any case, I'd probably like to do that sort of change in a separate
> series after we get the first part sorted.
> 
> > But it would be nice to explain in
> > the comment what is being moved and why, so I could then confirm that
> > the code matches the intent.
> > 
> 
> I'm happy to add comments, but I'm a little unclear on what you're
> confused by here. It's a bit too big of a patch for me to give a full
> play-by-play description. Can you elaborate on what you'd like to see?
> 

I don't need blow-by-blow, but all the behavioural changes should at
least be flagged in the intro, and possibly explained.
The one I particularly noticed is in nfsd_file_close_inode() which
previously used nfsd_file_dispose_list() which hands the final close off
to nfsd_filecache_wq.
But this patch now does the final close in-line so an fsnotify event
might now do the fsync.  I was assuming that was deliberate and wanted
it to be explained.  But maybe it wasn't deliberate?

The movement of flush_delayed_fput() threw me at first, but I think I
understand it now - the new code for close_inode_sync is much cleaner,
not needing dispose_list_sync.

Thanks,
NeilBrown





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux