On Tue, 2023-07-11 at 22:17 +1000, NeilBrown wrote:
> On Tue, 11 Jul 2023, Jeff Layton wrote:
> > On Tue, 2023-07-11 at 19:54 +1000, NeilBrown wrote:
> > > On Tue, 11 Jul 2023, Chuck Lever III wrote:
> > > >
> > > > > On Jul 10, 2023, at 6:29 PM, NeilBrown <neilb@xxxxxxx> wrote:
> > > > >
> > > > > On Tue, 11 Jul 2023, Chuck Lever wrote:
> > > > > > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > > >
> > > > > > Measure a source of thread scheduling inefficiency -- count threads
> > > > > > that were awoken but found that the transport queue had already been
> > > > > > emptied.
> > > > > >
> > > > > > An empty transport queue is possible when threads that run between
> > > > > > the wake_up_process() call and the woken thread returning from the
> > > > > > scheduler have pulled all remaining work off the transport queue
> > > > > > using the first svc_xprt_dequeue() in svc_get_next_xprt().
> > > > >
> > > > > I'm in two minds about this. The data being gathered here is
> > > > > potentially useful
> > > >
> > > > It's actually pretty shocking: I've measured more than
> > > > 15% of thread wake-ups find no work to do.
> > >
> > > That is a bigger number than I would have guessed!
> > >
> >
> > I'm guessing the idea is that the receiver is waking a thread to do the
> > work, and that races with one that's already running? I'm sure there are
> > ways we can fix that, but it really does seem like we're trying to
> > reinvent workqueues here.
>
> True. But then workqueues seem to reinvent themselves every so often
> too. Once gets the impression they are trying to meet an enormous
> variety of needs.
> I'm not against trying to see if nfsd could work well in a workqueue
> environment, but I'm not certain it is a good idea. Maintaining control
> of our own thread pools might be safer.
>
> I have a vague memory of looking into this in more detail once and
> deciding that it wasn't a good fit, but I cannot recall or easily deduce
> the reason. Obviously we would have to give up SIGKILL, but we want to
> do that anyway.
>
> Would we want unbound work queues - so they can be scheduled across
> different CPUs? Are NFSD threads cpu-intensive or not? I'm not sure.
>
> I would be happy to explore a credible attempt at a conversion.
>

I had some patches several years ago that did a conversion from nfsd
threads to workqueues. It worked reasonably well, but under heavy loads
it didn't perform as well as having a dedicated threadpool...which is
not too surprising, really. nfsd has been tuned for performance over
years and it's fairly "greedy" about squatting on system resources even
when it's idle.

If we wanted to look again at doing this with workqueues, we'd need the
workqueue infrastructure to allow for long-running jobs that may block
for a long time. That means it might need to be more cavalier about
spinning up new workqueue threads and keeping them running when there
are a lot of concurrent, but sleeping workqueue jobs.

We probably would want unbound workqueues so we can have more jobs in
flight at any given time than cpus, given that a lot of them will end up
being blocked in some way or another.

CPU utilization is a good question. Mostly we just call down into the
filesystem to do things, and the encoding and decoding is not _that_ cpu
intensive. For most storage stacks, I suspect we don't use a lot of CPU
aside from normal copying of data. There might be some exceptions
though, like when the underlying storage is using encryption, etc.

The main upside to switching to workqueues is that it would allow us to
get rid of a lot of fiddly, hand-tuned thread handling code. It might
also make it simpler to convert to a more asynchronous model.

For instance, it would be nice to be able to not have to put a thread to
sleep while waiting on writeback for a COMMIT. I think that'd be easier
to handle with a workqueue-like model.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>