On Wed, Jun 12, 2024 at 02:09:21PM +1000, NeilBrown wrote: > On Wed, 12 Jun 2024, Mike Snitzer wrote: > > On Wed, Jun 12, 2024 at 01:17:05PM +1000, NeilBrown wrote: > > > On Wed, 12 Jun 2024, Mike Snitzer wrote: > > > > > > > > SO I looked, and I'm saddened to see Neil's 6.8 commit 1e3577a4521e > > > > ("SUNRPC: discard sv_refcnt, and svc_get/svc_put"). > > > > > > > > [the lack of useful refcounting with the current code kind of blew me > > > > away.. but nice to see it existed not too long ago.] > > > > > > > > Rather than immediately invest the effort to revert commit > > > > 1e3577a4521e for my apparent needs... I'll send out v2 to allow for > > > > further review and discussion. > > > > > > > > But it really does feel like I _need_ svc_{get,put} and nfsd_{get,put} > > > > > > You are taking a reference, and at the right time. But it is to the > > > wrong thing. > > > > Well, that reference is to ensure nfsd (and nfsd_open_local_fh) is > > available for the duration of a local client connected to it. > > > > Really wasn't trying to keep nn->nfsd_serv around with this ;) > > > > > You call symbol_request(nfsd_open_local_fh) and so get a reference to > > > the nfsd module. But you really want a reference to the nfsd service. > > > > > > I would suggest that you use symbol_request() to get a function which > > > you then call and immediately symbol_put().... unless you need to use it > > > to discard the reference to the service later. > > > > Getting the nfsd_open_local_fh symbol once when client handshakes with > > server is meant to avoid needing to do so for every IO the client > > issues to the local server. > > > > > The function would take nfsd_mutex, check there is an nfsd_serv, sets a > > > flag or whatever to indicate the serv is being used for local_io, and > > > maybe returns the nfsd_serv. As long as that flag is set the serv > > > cannot be destroy. > > > > > > Do you need there to be available threads for LOCAL_IO to work? If so > > > the flag would cause setting the num threads to zero to fail. > > > If not .... that is weird. It would mean that setting the number of > > > threads to zero would not destroy the service and I don't think we want > > > to do that. > > > > > > So I think that when LOCAL_IO is in use, setting number of threads to > > > zero must return EBUSY or similar, even if you don't need the threads. > > > > Yes, but I really dislike needing to play games with a tangential > > characteristic of nfsd_serv (that threads are what hold reference), > > rather than have the ability to keep the nfsd_serv around in a cleaner > > way. > > > > This localio code doesn't run in nfsd context so it isn't using nfsd's > > threads. Forcing threads to be held in reserve because localio doesn't > > want nfsd_serv to go away isn't ideal. > > I started reading the rest of the patches and it seems that localio is > only used for READ, WRTE, COMMIT. Is that correct? Is there > documentation so that I don't have to ask? The header for v2's patch 7 (nfs/nfsd: add "localio" support) starts with: Add client support for bypassing NFS for localhost reads, writes, and commits. But I should've made it clearer by saying the same in the 0th header. > Obviously there are lots of other NFS requests so you wouldn't be able > to use localio without nfsd threads running.... That's very true. > But a normal remote client doesn't pin the nfsd threads or the > nfsd_serv. If the threads go away, the client blocks until the service > comes back. Would that be appropriate semantics for localio?? i.e. on > each nfsd_open_local_fh() call you mutex_trylock and hold that long > enough to get the 'struct file *'. If it fails because there is no > serv, you simply fall-back to the same path you use for other requests. > > Could that work? I can try it, but feels like it'd elevate nfsd_mutex to "contended", as such it feels heavy. > > Does it maybe make sense to introduce a more narrow svc_get/svc_put > > for this auxillary usecase? > > I don't think so. nfsd is a self-contained transactional service. It > doesn't promise to persist beyond each transaction. > Current transactions return status and/or data. Adding a new transaction > that returns a 'struct file *' fits that model reasonable well. Sure. But to be clear, I am adding global state to nfs_common that tracks nfsd_uuids. Those change every time a new nfsd_net is created for a given server (client will then lookup the uuid to see if local). But even if we went to the extreme where nfsd instances are bouncing like crazy, the 'nfsd_uuids' list in nfs_common should work fine. Just not seeing what is gained by nfsd being so ephemeral. Maybe your point is, it should work in that model too?.. I think it would, just less efficiently due to make-work to re-get resources it needs. > Taking an external reference to the nfs service is quite a big > conceptual change. Getting the nfsd_open_local_fh() symbol in a coarse-grained manner isn't about anything other than efficiency. Ensures localio client calls to nfsd_open_local_fh will work for as long as it exists on that local server -- nfs.ko's indirect reference to nfsd.ko (via nfs_localio.ko getting symbol for nfsd_open_local_fh) is dropped when client is destroyed. Mike