On Fri, Aug 30, 2024 at 03:47:31AM -0400, Mike Snitzer wrote: > On Fri, Aug 30, 2024 at 12:20:13PM +1000, NeilBrown wrote: > > Following are revised versions of 6 patches from the v14 localio series. > > > > The issue addressed is net namespace refcounting. > > > > We don't want to keep a long-term counted reference in the client > > because that prevents a server container from completely shutting down. > > > > So we avoid taking a reference at all and rely on the per-cpu reference > > to the server being sufficient to keep the net-ns active. This involves > > allowing the net-ns exit code to iterate all active clients and clear > > their ->net pointers (which they need to find the per-cpu-refcount for > > the nfs_serv). > > > > So: > > - embed nfs_uuid_t in nfs_client. This provides a list_head that can > > be used to find the client. It does add the actual uuid to nfs_client > > so it is bigger than needed. If that is really a problem we can find > > a fix. > > > > - When the nfs server confirms that the uuid is local, it moves the > > nfs_uuid_t onto a per-net-ns list. > > > > - When the net-ns is shutting down - in a "pre_exit" handler, all these > > nfS_uuid_t have their ->net cleared. There is an rcu_synchronize() > > call between pre_exit() handlers and exit() handlers so and caller > > that sees ->net as not NULL can safely check the ->counter > > > > - We now pass the nfs_uuid_t to nfsd_open_local_fh() so it can safely > > look at ->net in a private rcu_read_lock() section. > > > > I have compile tested this code but nothing more. > > > > Thanks, > > NeilBrown > > > > [PATCH 14/25] nfs_common: add NFS LOCALIO auxiliary protocol > > [PATCH 15/25] nfs_common: introduce nfs_localio_ctx struct and > > [PATCH 16/25] nfsd: add localio support > > [PATCH 17/25] nfsd: implement server support for NFS_LOCALIO_PROGRAM > > [PATCH 19/25] nfs: add localio support > > [PATCH 23/25] nfs: implement client support for NFS_LOCALIO_PROGRAM > > Hey Neil, > > I attempted to test the kernel with your changes but it crashed with: > > [ 55.422564] list_add corruption. next is NULL. > [ 55.423523] ------------[ cut here ]------------ > [ 55.424423] kernel BUG at lib/list_debug.c:27! ... > I'll triple check my melding of your changes and mine in ~7 hours.. I > may have missed something. Good news, I was just missing your fs/nfsd/nfsctl.c changes.. works well with those ;) Seeing a very slight drop in performance (with my well-worn smoke test that does 20 secs of aio directio 128K reads with 24 threads on my testbed). But I mean slight (~2-3%). Not worried about it, correctness trumps performance, but I think reclaiming that performance will be easy (by avoiding the need for KMEM_CACHE nfs_localio_ctx_{alloc,free}). > Note this is _not_ with your other incremental patch (that uses > __module_get) -- only because I didn't get to that yet. Moving on to incorporating your nfsd __module_get now. Then on to the work of switching from nfs_localio_ctx back to returning nfsd_file. Mike