Hi,

This patchset adds new tagset maps for WRR (weighted round robin),
similar to what the NVMe spec does.

The first patch adds three new hctx types, HCTX_TYPE_WRR_LOW/MEDIUM/HIGH,
and an attribute, blkio.wrr, that lets different blkcgs use different
tag maps. If the hardware queues support different priorities, the
driver can map its high priority hardware queues to HCTX_TYPE_WRR_HIGH.
For a block cgroup that needs high I/O priority, set its blkio.wrr to
high:

    echo "major:minor high" > blkio.wrr

so that its I/O is serviced more quickly. This is useful when different
containers share the same NVMe device: important containers use the
high priority hw queues, and the others use the medium or low priority
hw queues.

The second patch adds WRR support to null_blk to simulate what NVMe
does. A new irqmode=3 enables NULL_IRQ_WRR, in which a kernel thread
processes the I/O of all hardware queues in WRR fashion.

A simple test result:

We add 32 hw queues in total:

    default:    8
    read:       0
    poll:       0
    wrr_low:    8
    wrr_medium: 8
    wrr_high:   8

    insmod drivers/block/null_blk.ko irqmode=3 \
        submit_queues=32 submit_queues_wrr_high=8 \
        submit_queues_wrr_medium=8 submit_queues_wrr_low=8

    fio --bs=4k --ioengine=libaio --iodepth=16 --filename=/dev/nullb0 \
        --direct=1 --runtime=60 --numjobs=16 --rw=randread \
        --name=test$1 --group_reporting --gtod_reduce=1

Check the hardware context types:

    cat /sys/kernel/debug/block/nullb0/hctx*/type
    default
    wrr_low
    wrr_low
    wrr_low
    wrr_low
    wrr_low
    wrr_low
    wrr_medium
    wrr_medium
    wrr_medium
    wrr_medium
    default
    wrr_medium
    wrr_medium
    wrr_medium
    wrr_medium
    wrr_high
    wrr_high
    wrr_high
    wrr_high
    wrr_high
    wrr_high
    default
    wrr_high
    wrr_high
    default
    default
    default
    default
    default
    wrr_low
    wrr_low

1. Run 3 fio instances on the default hw queues:

    fio1: 130K
    fio2: 130K
    fio3: 130K

2. Use different hw queues:

    mkdir /sys/fs/cgroup/blkio/{low,medium,high}
    echo "251:0 high" > /sys/fs/cgroup/blkio/high/blkio.wrr
    echo "251:0 medium" > /sys/fs/cgroup/blkio/medium/blkio.wrr
    echo "251:0 low" > /sys/fs/cgroup/blkio/low/blkio.wrr
    echo `pidof fio1` > /sys/fs/cgroup/blkio/high/cgroup.procs
    echo `pidof fio2` > /sys/fs/cgroup/blkio/medium/cgroup.procs
    echo `pidof fio3` > /sys/fs/cgroup/blkio/low/cgroup.procs

    fio1: 260K
    fio2: 160K
    fio3: 60K

We dispatch 8 (high), 4 (medium), and 1 (low) I/Os at a time to
simulate a weighted round robin policy.

Weiping Zhang (2):
  block: add weighted round robin for blkcgroup
  null_blk: add support weighted round robin submition queue

 block/blk-cgroup.c            |  88 +++++++++++++
 block/blk-mq-debugfs.c        |   3 +
 block/blk-mq-sched.c          |   6 +-
 block/blk-mq-tag.c            |   4 +-
 block/blk-mq-tag.h            |   2 +-
 block/blk-mq.c                |  12 +-
 block/blk-mq.h                |  17 ++-
 block/blk.h                   |   2 +-
 drivers/block/null_blk.h      |   7 +
 drivers/block/null_blk_main.c | 294 ++++++++++++++++++++++++++++++++++++++++--
 include/linux/blk-cgroup.h    |   2 +
 include/linux/blk-mq.h        |  12 ++
 12 files changed, 426 insertions(+), 23 deletions(-)

-- 
2.14.1
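
For readers unfamiliar with the 8/4/1 dispatch mentioned in the cover letter, one round of such a policy can be sketched in user-space C as below. This is only an illustration under stated assumptions, not the code from the null_blk patch: the names wrr_class, wrr_weight, wrr_queue and wrr_round are hypothetical, and each priority class is modelled as a single counter rather than a set of hardware queues.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not taken from the patches. */
enum wrr_class { WRR_LOW, WRR_MEDIUM, WRR_HIGH, WRR_NR };

/* Per-round budget per class: 8 (high), 4 (medium), 1 (low). */
static const unsigned int wrr_weight[WRR_NR] = {
	[WRR_LOW]    = 1,
	[WRR_MEDIUM] = 4,
	[WRR_HIGH]   = 8,
};

/* Simplified model: one counter pair per priority class. */
struct wrr_queue {
	unsigned long pending;   /* I/Os waiting to be dispatched */
	unsigned long completed; /* I/Os dispatched so far */
};

/*
 * Run one WRR round: visit the classes from high to low and dispatch
 * at most wrr_weight[class] pending I/Os from each.
 * Returns the number of I/Os dispatched in this round.
 */
static unsigned long wrr_round(struct wrr_queue q[WRR_NR])
{
	unsigned long total = 0;

	assert(q != NULL);

	for (int c = WRR_HIGH; c >= WRR_LOW; c--) {
		unsigned long n = q[c].pending;

		/* Cap the batch at this class's weight. */
		if (n > wrr_weight[c])
			n = wrr_weight[c];

		q[c].pending -= n;
		q[c].completed += n;
		total += n;
	}

	return total;
}
```

With all three classes backlogged, each call to wrr_round() dispatches 13 I/Os split 8/4/1. A class with fewer pending I/Os than its weight simply uses less of its budget in that round.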