Hi,

We are observing an fio hang with the latest stable kernel, v5.10.78, when using RNBD.

The setup is as follows:

- On the server side, 16 nullblk devices.
- On the client side, map those 16 block devices through RNBD-RTRS.
- Change the scheduler for those RNBD block devices to mq-deadline.
- Run fio with the following configuration:

[global]
description=Emulation of Storage Server Access Pattern
bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4
fadvise_hint=0
rw=randrw:2
direct=1
random_distribution=zipf:1.2
time_based=1
runtime=60
ramp_time=1
ioengine=libaio
iodepth=128
iodepth_batch_submit=128
iodepth_batch_complete=128
numjobs=1
group_reporting

[job1]
filename=/dev/rnbd0

[job2]
filename=/dev/rnbd1

[job3]
filename=/dev/rnbd2

[job4]
filename=/dev/rnbd3

[job5]
filename=/dev/rnbd4

[job6]
filename=/dev/rnbd5

[job7]
filename=/dev/rnbd6

[job8]
filename=/dev/rnbd7

[job9]
filename=/dev/rnbd8

[job10]
filename=/dev/rnbd9

[job11]
filename=/dev/rnbd10

[job12]
filename=/dev/rnbd11

[job13]
filename=/dev/rnbd12

[job14]
filename=/dev/rnbd13

[job15]
filename=/dev/rnbd14

[job16]
filename=/dev/rnbd15

Some of the fio threads hang and fio never finishes:

$ fio fio.ini
job1: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job2: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job3: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job4: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job5: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job6: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job7: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job8: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job9: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job10: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job11: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job12: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job13: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job14: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job15: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job16: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
fio-3.12
Starting 16 processes
Jobs: 16 (f=12): [m(3),/(2),m(5),/(1),m(1),/(1),m(3)][0.0%][r=130MiB/s,w=130MiB/s][r=14.7k,w=14.7k IOPS][eta 04d:07h:4
Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][51.2%][r=7395KiB/s,w=6481KiB/s][r=770,w=766 IOPS][eta 01m:01s]
Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][52.7%][eta 01m:01s]

We checked the block devices, and there are requests waiting in their mq-deadline FIFO lists (not on all devices, only on a few whose corresponding fio threads are hung).
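A quick way to look at all 16 devices at once is a loop over the mq-deadline debugfs entries, along these lines (just a sketch; it assumes debugfs is mounted at /sys/kernel/debug, mq-deadline is still the active scheduler on the rnbd devices, and it has to run as root):

for d in /sys/kernel/debug/block/rnbd*/sched; do
        echo "== $d =="
        # dump the read and write FIFO lists of the mq-deadline scheduler for this device
        cat "$d"/read_fifo_list "$d"/write_fifo_list
done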
On rnbd0, for example:

$ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list
00000000ce398aec {.op=READ, .cmd_flags=, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=209}
000000005ec82450 {.op=READ, .cmd_flags=, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=210}

$ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list
000000000c1557f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=195}
00000000fc6bfd98 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=199}
000000009ef7c802 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=217}

Potential points that make the hang go away:

1) Using no scheduler (none) on the client-side RNBD block devices results in no hang (see the P.S. below for the one-liner we use to switch).

2) In the fio config, changing the line "iodepth_batch_complete=128" to the following fixes the hang:

iodepth_batch_complete_min=1
iodepth_batch_complete_max=128

or, alternatively:

iodepth_batch_complete=0

3) We also tracked down the kernel version in which the hang first appears. It started with v5.10.50, and the following commit is the one that introduces the hang:

commit 512106ae2355813a5eb84e8dc908628d52856890
Author: Ming Lei <ming.lei@xxxxxxxxxx>
Date:   Fri Jun 25 10:02:48 2021 +0800

    blk-mq: update hctx->dispatch_busy in case of real scheduler

    [ Upstream commit cb9516be7708a2a18ec0a19fe3a225b5b3bc92c7 ]

    Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
    starts to support io batching submission by using hctx->dispatch_busy.

    However, blk_mq_update_dispatch_busy() isn't changed to update
    hctx->dispatch_busy in that commit, so fix the issue by updating
    hctx->dispatch_busy in case of real scheduler.

    Reported-by: Jan Kara <jack@xxxxxxx>
    Reviewed-by: Jan Kara <jack@xxxxxxx>
    Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
    Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20210625020248.1630497-1-ming.lei@xxxxxxxxxx
    Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 00d6ed2fe812..a368eb6dc647 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1242,9 +1242,6 @@ static void blk_mq_update_dispatch_busy(struct blk_mq_hw_ctx *hctx, bool busy)
 {
 	unsigned int ewma;
 
-	if (hctx->queue->elevator)
-		return;
-
 	ewma = hctx->dispatch_busy;
 
 	if (!ewma && !busy)

We reverted this commit, tested again, and there is no hang.

4) Lastly, we also tested a newer kernel, v5.13, and there is NO hang there either. Hence, probably some other change fixed it.

Regards
-Haris
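P.S. For reference, workaround 1) above is just a loop over the queue sysfs attributes of the mapped devices, roughly as follows (a sketch, run as root; the same loop with "mq-deadline" is how the scheduler is set for the test in the first place):

for q in /sys/block/rnbd*/queue/scheduler; do
        # switch the elevator of every mapped RNBD device back to "none"
        echo none > "$q"
done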