On 1/31/24 8:42 AM, kernel test robot wrote:
> Hello,
>
> kernel test robot noticed a -72.9% regression of fio.write_iops on:
>
> commit: 574e7779cf583171acb5bf6365047bb0941b387c ("block/mq-deadline: use separate insertion lists")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> testcase: fio-basic
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
>
> 	runtime: 300s
> 	disk: 1HDD
> 	fs: xfs
> 	nr_task: 100%
> 	test_size: 128G
> 	rw: write
> 	bs: 4k
> 	ioengine: io_uring
> 	direct: direct
> 	cpufreq_governor: performance

I looked into this, and I think I see what is happening. We still do
insertion merges, but they're now postponed to dispatch time. This means
that for this crazy case, where you have 64 threads doing sequential
writes, we run out of tags (which is 64 by default) and hence dispatch
sooner than we would've before.

Before, we would've queued one request, then allocated a new one, and
queued that. When that queue event happened, we would merge with the
previous request - either upfront, or when the request was inserted.
Either way, we ended up with one bigger request rather than two smaller
ones that still needed merging, which leaves more requests free.

I think we can solve this by doing smarter merging at insertion time.
I've dropped the series from my for-next branch for now; it will need
revisiting, and then I'll post it again.

-- 
Jens Axboe
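[Editor's note: the sketch below is not from the original mail or the kernel
source; it is a hypothetical, self-contained model of the tag-accounting
effect described above. All names and numbers (NR_TAGS, IOS_PER_THR, the
per-thread layout) are made up for illustration only.]

/*
 * Toy model: 64 threads each queue a few sequential 4k writes against a
 * pool of 64 tags. If adjacent requests are merged at insertion time, each
 * thread's run collapses into one request; if merging is deferred to
 * dispatch time, every 4k write holds its own tag and the pool is
 * exhausted, forcing earlier dispatch.
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_TAGS      64
#define NR_THREADS   64
#define IOS_PER_THR   4   /* queued sequential 4k writes per thread */

struct pending {
	unsigned long long sector;  /* start sector of the pending request */
	unsigned int sectors;       /* length in sectors (4k == 8 sectors) */
};

/* Try to back-merge a new 4k write into the thread's last pending request. */
static bool try_insert_merge(struct pending *p, unsigned long long sector)
{
	if (p->sectors && p->sector + p->sectors == sector) {
		p->sectors += 8;    /* extend in place, no new tag needed */
		return true;
	}
	return false;
}

static unsigned int tags_used(bool merge_at_insert)
{
	struct pending last[NR_THREADS] = { 0 };
	unsigned int tags = 0;

	for (int t = 0; t < NR_THREADS; t++) {
		unsigned long long sector = (unsigned long long)t << 20;

		for (int i = 0; i < IOS_PER_THR; i++, sector += 8) {
			if (merge_at_insert && try_insert_merge(&last[t], sector))
				continue;   /* merged: no new tag consumed */
			last[t].sector = sector;
			last[t].sectors = 8;
			tags++;             /* new request, new tag */
		}
	}
	return tags;
}

int main(void)
{
	printf("merge at insert:   %u tags needed (pool = %u)\n",
	       tags_used(true), NR_TAGS);
	printf("merge at dispatch: %u tags needed (pool = %u)\n",
	       tags_used(false), NR_TAGS);
	return 0;
}

With insertion-time merging the model needs 64 tags (one per thread); with
merging deferred, it wants 256 and blows past the 64-tag pool, which is the
effect the "smarter merging at insertion time" suggestion would avoid.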