On 1/16/24 9:54 AM, Jens Axboe wrote:
> Results in patch 2, but tldr is a more than 9% improvement (108M -> 118M
> IOPS) for my test case, which doesn't even enable most of the costly
> block layer items that you'd typically find in a distro and which would
> further increase the number of issue side time calls. This brings iostats
> enabled _almost_ to the level of turning it off.

I enabled the typical distro things (block cgroups, blk-wbt, iocost,
iolatency), which all add considerable cost (reducing that is an
optimization project in itself). This is the performance of the stock
kernel with iostats enabled:

IOPS=91.01M, BW=44.44GiB/s, IOS/call=32/32
IOPS=91.29M, BW=44.58GiB/s, IOS/call=31/32
IOPS=91.27M, BW=44.57GiB/s, IOS/call=32/31
IOPS=91.26M, BW=44.56GiB/s, IOS/call=32/31
IOPS=91.38M, BW=44.62GiB/s, IOS/call=32/31
IOPS=91.28M, BW=44.57GiB/s, IOS/call=32/32

which is down from 122M for an optimized config with iostats off.
With this patchset applied (and one extra patch, missed a spot...), we
now get:

IOPS=101.38M, BW=49.50GiB/s, IOS/call=32/32
IOPS=101.31M, BW=49.47GiB/s, IOS/call=32/32
IOPS=101.35M, BW=49.49GiB/s, IOS/call=31/31
IOPS=101.44M, BW=49.53GiB/s, IOS/call=32/31
IOPS=101.32M, BW=49.47GiB/s, IOS/call=32/32
IOPS=101.14M, BW=49.38GiB/s, IOS/call=32/31

which is about a 10% improvement. I mostly ran this because I was
curious: while the above config changes do add more time stamping, they
also add other overhead. In any case, a 10% win for the distro config
case is not bad at all.

-- 
Jens Axboe
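
For anyone who wants to check the percentages quoted above, here is a minimal sketch of the arithmetic. The sample values are copied from the two runs in this mail; the helper name `pct_gain` and the variable names are made up for illustration and are not part of the patchset or of any tool.

```python
from statistics import mean

# IOPS samples in millions, copied from the two runs quoted in the mail.
stock_iops = [91.01, 91.29, 91.27, 91.26, 91.38, 91.28]
patched_iops = [101.38, 101.31, 101.35, 101.44, 101.32, 101.14]

# Figure quoted for an optimized config with iostats off (millions of IOPS).
optimized_iops = 122.0


def pct_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


stock_mean = mean(stock_iops)
patched_mean = mean(patched_iops)

# Gain of the patched kernel over the stock kernel, distro-style config.
gain = pct_gain(stock_mean, patched_mean)

# How close the patched distro-style config gets to the optimized config.
fraction_of_optimized = patched_mean / optimized_iops * 100.0

# Gain quoted in the original cover letter (108M -> 118M IOPS).
cover_letter_gain = pct_gain(108.0, 118.0)

print(f"stock mean:    {stock_mean:.2f}M IOPS")
print(f"patched mean:  {patched_mean:.2f}M IOPS")
print(f"gain:          {gain:.1f}%")
print(f"of optimized:  {fraction_of_optimized:.1f}%")
print(f"cover letter:  {cover_letter_gain:.1f}%")
```

Averaging the six samples per run is a choice made here, not something the mail specifies. On these numbers the gain comes out slightly above the "about 10%" stated above.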