Hi,
On 2023/10/22 0:21, Bart Van Assche wrote:
On 10/21/23 00:32, Yu Kuai wrote:
Sorry for the long delay. I struggled to implement a smooth algorithm
for borrowing tags and returning borrowed tags, and later I put this
on ice and focused on other work.
I had an idea to implement a state machine; however, the amount of code
was excessive and I gave up. Later I implemented a simpler version and
tested it with your case, 32 tags and 2 shared nodes. The result looks
good (see below); however, I'm not confident that it works well in general.
Anyway, I'll send a new RFC version for this, and please let me know if
you still think this approach is unacceptable.
Thanks,
Kuai
Test script:
[global]
ioengine=libaio
iodepth=2
bs=4k
direct=1
rw=randrw
group_reporting
[sda]
numjobs=32
filename=/dev/sda
[sdb]
numjobs=1
filename=/dev/sdb
Test result, obtained by monitoring the new debugfs entry shared_tag_info:
time    active        available
        sda   sdb     sda   sdb
0       0     0       32    32
1       16    2       16    16   -> start fair sharing
2       19    2       20    16
3       24    2       24    16
4       26    2       28    16   -> borrow 32/8=4 tags each round
5       28    2       28    16   -> save at least 4 tags for sdb
Hi Yu,
Thank you for having shared these results. What is the unit of the
numbers in the time column?
I added a timer to blk_mq_tags, and the timer function implements
borrowing tags and returning borrowed tags. The timer starts when one
node is busy and expires after HZ jiffies, so each step in the time
column is one second.
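
To make the annotations in the result table concrete, here is a small
userspace model of one borrowing round. It is only a sketch of what the
table describes (borrow total/8 tags per timer round, keep at least 4
tags for the other queue). It is not the code from the patch, and the
names borrow_round(), simulate_rounds(), BORROW_DIVISOR and MIN_RESERVED
are hypothetical.

```c
#include <assert.h>

/* Hypothetical constants, read off the annotations in the result table. */
#define BORROW_DIVISOR 8u	/* borrow total/8 tags per round (32/8 = 4) */
#define MIN_RESERVED   4u	/* keep at least 4 tags for the other queue */

/*
 * One timer round for the busy queue: raise its tag limit by one borrow
 * step, clamped so that MIN_RESERVED tags remain for the other sharer.
 */
static unsigned int borrow_round(unsigned int total, unsigned int available)
{
	unsigned int step = total / BORROW_DIVISOR;
	unsigned int cap, next;

	assert(total > MIN_RESERVED);
	cap = total - MIN_RESERVED;
	next = available + step;

	return next > cap ? cap : next;
}

/* Apply 'rounds' consecutive timer expirations (one per second). */
static unsigned int simulate_rounds(unsigned int total, unsigned int start,
				    unsigned int rounds)
{
	while (rounds--)
		start = borrow_round(total, start);
	return start;
}
```

Starting from the fair share of 16 with 32 tags in total, successive
calls to borrow_round() give 20, 24, 28 and then stay at 28, which is
the sequence in the "available sda" column for time 2 to 5.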
In the above I see that more tags are assigned to sda than to sdb
although I/O is being submitted to both LUNs. I think the current
algorithm defines fairness as dividing tags in a fair way across active
LUNs. Do the above results show that tags are divided per active job
instead of per active LUN? If so, I'm not sure that everyone will agree
that this is a fair way to distribute tags ...
Yes, active tags are divided per active LUN, specifically per
request_queue or hctx that is sharing tags.
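
For reference, the 16/16 split at time 1 is what an even division of
the tag depth across active queues gives. A minimal userspace sketch,
assuming the same shape as the rounding-up division with a floor of 4
used by the existing check in hctx_may_queue(); the names
fair_share_depth() and MIN_SHARE are hypothetical:

```c
#include <assert.h>

#define MIN_SHARE 4u	/* hypothetical name: always allow at least some tags */

/*
 * Per-queue tag limit when 'active_queues' request_queues/hctxs share
 * 'total_tags': divide evenly, rounding up, but never go below MIN_SHARE.
 * The number of jobs submitting to a queue plays no role here.
 */
static unsigned int fair_share_depth(unsigned int total_tags,
				     unsigned int active_queues)
{
	unsigned int share;

	assert(active_queues > 0);
	share = (total_tags + active_queues - 1) / active_queues;

	return share < MIN_SHARE ? MIN_SHARE : share;
}
```

With 32 tags and sda and sdb both active, fair_share_depth(32, 2)
returns 16 for each queue, regardless of numjobs=32 on sda and
numjobs=1 on sdb.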
Thanks,
Kuai
Thanks,
Bart.