On Tue, Mar 18, 2025 at 12:57:17AM -0700, Christoph Hellwig wrote:
> On Tue, Mar 18, 2025 at 03:27:48PM +1100, Dave Chinner wrote:
> > Yes, NOWAIT may then add an incremental performance improvement on
> > top for optimal layout cases, but I'm still not yet convinced that
> > it is a generally applicable loop device optimisation that everyone
> > wants to always enable due to the potential for 100% NOWAIT
> > submission failure on any given loop device.....

NOWAIT failure can actually be avoided:

https://lore.kernel.org/linux-block/20250314021148.3081954-6-ming.lei@xxxxxxxxxx/

>
> Yes, I think this is a really good first step:
>
> 1) switch loop to use a per-command work_item unconditionally, which also
>    has the nice effect that it cleans up the horrible mess of the
>    per-blkcg workers.  (note that this is what the nvmet file backend has

Per-command work could actually be worse, because IO handling is then
spread across all the system wq worker contexts.

>    always done with good result)

Per-command work burns a lot of CPU unnecessarily, which is bad for
container use cases, and it cannot perform as well as NOWAIT.

> 2) look into NOWAIT submission, especially for reads this should be
>    a clear winner and probably done unconditionally.  For writes it
>    might be a bit of a tradeoff if we expect the writes to allocate
>    a lot, so we might want some kind of tunable for it.

It is a winner for overwrites too.

WRITEs that need allocation can still be submitted from wq context, see
my patchset V2.

Thanks,
Ming
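
For readers following the thread, the submission strategy discussed above
(try NOWAIT first, and hand the request to a worker only when it cannot
complete without blocking) can be sketched in user space. This is only an
illustration under stated assumptions, not the loop driver code or the
linked patchset: it uses preadv2()'s RWF_NOWAIT flag through Python's
os.preadv(), a one-thread pool stands in for the wq context, and the
function name, the "nowait"/"worker" labels and the set of errno values
treated as "fall back" are all hypothetical choices made for the example.

```python
import errno
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Assumption: a one-thread pool stands in for the loop driver's wq context.
_wq = ThreadPoolExecutor(max_workers=1)

# RWF_NOWAIT is only exposed on Linux; None means "always use the worker".
_RWF_NOWAIT = getattr(os, "RWF_NOWAIT", None)

# errno values treated here as "would block" or "NOWAIT not supported".
_FALLBACK = {errno.EAGAIN, errno.EOPNOTSUPP, errno.ENOSYS, errno.EINVAL}


def read_nowait_then_worker(fd, length, offset):
    """Return (data, path); path is "nowait" or "worker" (hypothetical labels)."""
    if _RWF_NOWAIT is not None and hasattr(os, "preadv"):
        buf = bytearray(length)
        try:
            # Fast path: succeeds only if the read needs no blocking.
            n = os.preadv(fd, [buf], offset, _RWF_NOWAIT)
            if n == length:
                return bytes(buf), "nowait"
            # Short read: fall through and let the worker redo the whole read.
        except NotImplementedError:
            pass
        except OSError as exc:
            if exc.errno not in _FALLBACK:
                raise
    # Slow path: an ordinary blocking read issued from the worker thread.
    return _wq.submit(os.pread, fd, length, offset).result(), "worker"


# Usage: read back a small, freshly written temporary file.
payload = b"loop" * 1024
with tempfile.NamedTemporaryFile() as backing:
    backing.write(payload)
    backing.flush()
    data, path = read_nowait_then_worker(backing.fileno(), len(payload), 0)
print(path, len(data))
```

Which path is taken depends on the kernel, the filesystem and the page
cache state, so the sketch reports it rather than assuming it. The point
is only that a failed NOWAIT attempt costs one extra submission and never
fails the request itself, since the worker path always remains available.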