Re: [PATCH v14-plus 00/25] Address netns refcount issues for localio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 30, 2024 at 03:47:31AM -0400, Mike Snitzer wrote:
> On Fri, Aug 30, 2024 at 12:20:13PM +1000, NeilBrown wrote:
> > Following are revised versions of 6 patches from the v14 localio series.
> > 
> > The issue addressed is net namespace refcounting.
> > 
> > We don't want to keep a long-term counted reference in the client
> > because that prevents a server container from completely shutting down.
> > 
> > So we avoid taking a reference at all and rely on the per-cpu reference
> > to the server being sufficient to keep the net-ns active.  This involves
> > allowing the net-ns exit code to iterate all active clients and clear
> > their ->net pointers (which they need to find the per-cpu-refcount for
> > the nfs_serv).
> > 
> > So:
> >  - embed nfs_uuid_t in nfs_client.  This provides a list_head that can
> >    be used to find the client.  It does add the actual uuid to nfs_client
> >    so it is bigger than needed.  If that is really a problem we can find
> >    a fix.
> > 
> >  - When the nfs server confirms that the uuid is local, it moves the
> >    nfs_uuid_t onto a per-net-ns list.
> > 
> >  - When the net-ns is shutting down - in a "pre_exit" handler, all these
> >    nfS_uuid_t have their ->net cleared.  There is an rcu_synchronize()
> >    call between pre_exit() handlers and exit() handlers so and caller
> >    that sees ->net as not NULL can safely check the ->counter
> > 
> >  - We now pass the nfs_uuid_t to nfsd_open_local_fh() so it can safely
> >    look at ->net in a private rcu_read_lock() section.
> > 
> > I have compile tested this code but nothing more.
> > 
> > Thanks,
> > NeilBrown
> > 
> >  [PATCH 14/25] nfs_common: add NFS LOCALIO auxiliary protocol
> >  [PATCH 15/25] nfs_common: introduce nfs_localio_ctx struct and
> >  [PATCH 16/25] nfsd: add localio support
> >  [PATCH 17/25] nfsd: implement server support for NFS_LOCALIO_PROGRAM
> >  [PATCH 19/25] nfs: add localio support
> >  [PATCH 23/25] nfs: implement client support for NFS_LOCALIO_PROGRAM
> 
> Hey Neil,
> 
> I attempted to test the kernel with your changes but it crashed with:
> 
> [   55.422564] list_add corruption. next is NULL.
> [   55.423523] ------------[ cut here ]------------
> [   55.424423] kernel BUG at lib/list_debug.c:27!
...
> I'll triple check my melding of your changes and mine in ~7 hours.. I
> may have missed something.

Good news, I was just missing your fs/nfsd/nfsctl.c changes.. works
well with those ;)

Seeing a very slight drop in performance (with my well-worn smoke test
that does 20 secs of aio directio 128K reads with 24 threads on my
testbed).  But I mean slight (~2-3%).  Not worried about it,
correctness trumps performance, but I think reclaiming that
performance will be easy (by avoiding the need for KMEM_CACHE
nfs_localio_ctx_{alloc,free}).

> Note this is _not_ with your other incremental patch (that uses
> __module_get) -- only because I didn't get to that yet.

Moving on to incorporating your nfsd __module_get now.  Then on to the
work of switching from nfs_localio_ctx back to returning nfsd_file.

Mike




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux