Hi Leah, On Thu, Oct 03, 2024 at 02:13:52PM -0700, Leah Rumancik wrote: > Hello, > > I have been investigating a read performance regression of dm-snapshot > on top of loopback in which the read time for a dd command increased > from 2min to 40min. I bisected the issue to dc5fc361d89 ("block: > attempt direct issue of plug list"). I blktraced before and after this > commit and the main difference I saw was that before this commit, when > the performance was good, there were a lot of IO unplugs on the loop > dev. After this commit, I saw 0 IO unplugs. /me makes a not that it might make sense to enhance the tracing to show which of the trace_block_unplug call sites did a particular unplug because that might be helpful here, but I suspect the ones you saw the ones from blk_mq_dispatch_plug_list, which now gets bypassed. I'm not really sure how that changed things, except that I know in old kernels we had issues with reordering I/O in this path, and maybe your workload hit that? Did the issue order change in the blktrace? > On the mainline, I was also able to bisect to a commit which fixed > this issue: 667ea36378cf ("loop: don't set QUEUE_FLAG_NOMERGES"). I > also blktraced before and after this commit, and unsurprisingly, the > main difference was that commit resulted in IO merges whereas > previously there were none being. With QUEUE_FLAG_NOMERGES even very basic merging is enabled, so that would fix any smaller amount of reordering. It is in fact a bit stupid that this ever got set by default on the loop driver. > 6.6.y and 6.1.y and were both experiencing the performance issue. I > tried porting 667ea36378 to these branches; it applied cleanly and > resolved the issue for both. So perhaps we should consider it for the > stable trees, but it'd be great if someone from the block layer could > chime in with a better idea of what's going on here. I'm totally fine with backporting the patch.