On Thu, Mar 20, 2025 at 12:08:19AM -0700, Christoph Hellwig wrote:
> On Tue, Mar 18, 2025 at 05:34:28PM +0800, Ming Lei wrote:
> > On Tue, Mar 18, 2025 at 12:57:17AM -0700, Christoph Hellwig wrote:
> > > On Tue, Mar 18, 2025 at 03:27:48PM +1100, Dave Chinner wrote:
> > > > Yes, NOWAIT may then add an incremental performance improvement on
> > > > top for optimal layout cases, but I'm still not yet convinced that
> > > > it is a generally applicable loop device optimisation that everyone
> > > > wants to always enable due to the potential for 100% NOWAIT
> > > > submission failure on any given loop device.....
> >
> > NOWAIT failure can be avoided actually:
> >
> > https://lore.kernel.org/linux-block/20250314021148.3081954-6-ming.lei@xxxxxxxxxx/
>
> That's a very complex set of heuristics which doesn't match up
> with other uses of it.

I'd suggest you to point them out in the patch review.

>
> >
> > >
> > > Yes, I think this is a really good first step:
> > >
> > > 1) switch loop to use a per-command work_item unconditionally, which also
> > >    has the nice effect that it cleans up the horrible mess of the
> > >    per-blkcg workers. (note that this is what the nvmet file backend has
> >
> > It could be worse to take per-command work, because IO handling crosses
> > all system wq worker contexts.
>
> So do other workloads with pretty good success.
>
> >
> > > always done with good result)
> >
> > per-command work does burn lots of CPU unnecessarily, it isn't good for
> > use case of container
>
> That does not match my observations in say nvmet. But if you have
> numbers please share them.

Please see the result I posted:

https://lore.kernel.org/linux-block/Z9FFTiuMC8WD6qMH@fedora/

Thanks,
Ming