Hi,
On 2020/12/27 19:58, Ming Lei wrote:
Hi Yu Kuai,
On Sat, Dec 26, 2020 at 06:28:06PM +0800, Yu Kuai wrote:
When sharing a tag set, most disks may issue only a small amount of IO while
a few issue a large amount. The current approach limits the maximum number
of tags each disk can get to an equal share of the total tags. Thus the few
heavily loaded disks can't get enough tags while many tags are still free in
the tag set.
Yeah, the current approach just allocates the same share for each active queue,
and that share is re-evaluated in each timeout period.
That said, you are trying to improve the following case:
- heavy IO on one or several disks, where the average share for these
disks becomes the bottleneck of IO performance
- a small amount of IO on other disks attached to the same host, with all of
those IOs submitted to the disks within a <30 second period.
Just wondering if you can share the workload you are trying to optimize,
or is it just a theoretical improvement? What are the disks (HDD, SSD
or NVMe) and the host? How many disks are in your setup? And how deep is
the tag set?
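Just to make sure we are talking about the same thing: the limit I want to
improve is the equal-share check applied when the tag set is shared. A minimal
user-space sketch of that calculation (modeled on the depth computation in
hctx_may_queue(); simplified, not the exact kernel code):

#include <stdio.h>

/*
 * Rough model of the per-queue tag limit applied when a tag set is
 * shared: every active queue gets an equal share of the total tags,
 * rounded up, with a small floor (simplified from hctx_may_queue()).
 */
static unsigned int shared_tag_limit(unsigned int total_tags,
                                     unsigned int active_queues)
{
        unsigned int depth;

        if (active_queues <= 1)
                return total_tags;

        depth = (total_tags + active_queues - 1) / active_queues;
        return depth < 4 ? 4 : depth;
}

int main(void)
{
        unsigned int users;

        /* per-queue limit for a 128-tag set (example size) */
        for (users = 1; users <= 16; users++)
                printf("active_queues=%2u -> per-queue limit=%3u\n",
                       users, shared_tag_limit(128, users));
        return 0;
}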
The details of the environment where we found the problem are as follows:
total driver tags: 128
number of disks: 13 (network drives, which form a dm-multipath device)
default queue_depth: 32
disk performance: when tested with 4k randread and a single thread, iops is
300, and it can go up to 4000 with 32 threads.
test cmd: fio -ioengine=psync -numjobs=32 ...
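With these numbers, once all 13 queues are counted as active, each disk's
limit from the share calculation above is (128 + 13 - 1) / 13 = 10 tags,
while the default queue_depth is 32 and 32 fio jobs are pounding the heavily
loaded disk, so most of the jobs end up waiting for tags even though the
other 12 disks leave almost all of their share unused.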
We found that mpath issues sg_io periodically (about every 15s), which leads
to active_queues being set to 13 for about 5s in every 15s.
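To illustrate, here is a toy timeline of one such 15s period (same share
calculation as in the sketch above; the 5s/15s figures are from our tracing,
and I assume only the heavily loaded disk stays active outside that window):

#include <stdio.h>

/* same equal-share calculation as in the earlier sketch */
static unsigned int shared_tag_limit(unsigned int total_tags,
                                     unsigned int active_queues)
{
        unsigned int depth;

        if (active_queues <= 1)
                return total_tags;

        depth = (total_tags + active_queues - 1) / active_queues;
        return depth < 4 ? 4 : depth;
}

int main(void)
{
        unsigned int sec;

        /*
         * One 15s period: the periodic sg_io keeps all 13 queues counted
         * as active for roughly the first 5 seconds; after that only the
         * heavily loaded disk is active (assumption based on our tracing).
         */
        for (sec = 0; sec < 15; sec++) {
                unsigned int active = (sec < 5) ? 13 : 1;

                printf("t=%2us active_queues=%2u busy-disk tag limit=%3u\n",
                       sec, active, shared_tag_limit(128, active));
        }
        return 0;
}

So for roughly a third of the time the busy disk is capped at 10 in-flight
requests even though most of the tag set is idle.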
By the way, I'm not sure whether this is a common scenario; however,
single-queue (sq) doesn't have such a problem.
Thanks
Yu Kuai