Hello Jens, If I leave the srp-test software running for a few minutes using the following command: # while ~bart/software/infiniband/srp-test/run_tests -d -r 30; do :; done then after some time the following complaint appears for multiple kworkers: INFO: task kworker/9:0:65 blocked for more than 480 seconds. Tainted: G I 4.11.0-rc4-dbg+ #5 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/9:0 D 0 65 2 0x00000000 Workqueue: dio/dm-0 dio_aio_complete_work Call Trace: __schedule+0x3df/0xc10 schedule+0x38/0x90 rwsem_down_write_failed+0x2c4/0x4c0 call_rwsem_down_write_failed+0x17/0x30 down_write+0x5a/0x70 __generic_file_fsync+0x43/0x90 ext4_sync_file+0x2d0/0x550 vfs_fsync_range+0x46/0xa0 dio_complete+0x181/0x1b0 dio_aio_complete_work+0x17/0x20 process_one_work+0x208/0x6a0 worker_thread+0x49/0x4a0 kthread+0x107/0x140 ret_from_fork+0x2e/0x40 I had not yet observed this behavior with kernel v4.10 or older. If this happens and I check the queue state with the following script: #!/bin/bash cd /sys/class/block || exit $? for dev in *; do if [ -e "$dev/mq" ]; then echo "$dev" for f in "$dev"/mq/*/{pending,*/rq_list}; do [ -e "$f" ] || continue if { read -r line1 && read -r line2; } <"$f"; then echo "$f" echo "$line1 $line2" >/dev/null head -n 9 "$f" fi done ( cd /sys/kernel/debug/block >&/dev/null && for d in "$dev"/mq/*; do [ ! -d "$d" ] && continue grep -q '^busy=0$' "$d/tags" && continue for f in "$d"/{dispatch,tags*,cpu*/rq_list}; do [ -e "$f" ] && grep -aH '' "$f" done done ) fi done then the following output appears: sda sdb sdc sdd sdd/mq/3/dispatch:ffff880401655d00 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=59, .internal_tag=-1} sdd/mq/3/tags:nr_tags=62 sdd/mq/3/tags:nr_reserved_tags=0 sdd/mq/3/tags:active_queues=0 sdd/mq/3/tags: sdd/mq/3/tags:bitmap_tags: sdd/mq/3/tags:depth=62 sdd/mq/3/tags:busy=31 sdd/mq/3/tags:bits_per_word=8 sdd/mq/3/tags:map_nr=8 sdd/mq/3/tags:alloc_hint={23, 23, 52, 1, 55, 29, 17, 22, 34, 48, 25, 49, 37, 43, 58, 25, 6, 20, 50, 14, 55, 7, 21, 17, 26, 36, 43, 43, 4, 6, 3, 47} sdd/mq/3/tags:wake_batch=7 sdd/mq/3/tags:wake_index=0 sdd/mq/3/tags:ws={ sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sdd/mq/3/tags:} sdd/mq/3/tags:round_robin=0 sdd/mq/3/tags_bitmap:00000000: ffff ff1f 0000 0018 sdd/mq/3/cpu5/rq_list:ffff880401657440 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=60, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037aba0000 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=0, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037aba1740 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=1, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037aba2e80 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=2, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037aba45c0 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=3, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037aba5d00 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=4, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037aba7440 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=5, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037aba8b80 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=6, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037abaa2c0 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=7, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037ababa00 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=8, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037abad140 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=9, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88037abae880 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=10, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff880369900000 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=11, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff880369901740 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=12, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff880369902e80 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=13, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8803699045c0 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=14, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff880369905d00 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=15, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff880369907440 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=16, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff880369908b80 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=17, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88036990a2c0 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=18, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88036990ba00 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=19, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88036990d140 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=20, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff88036990e880 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=21, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8804001b0000 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=22, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8804001b1740 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=23, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8804001b2e80 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=24, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8804001b45c0 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=25, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8804001b5d00 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=26, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8804001b7440 {.cmd_flags=0xca01, .rq_flags=0x2040, .tag=27, .internal_tag=-1} sdd/mq/3/cpu5/rq_list:ffff8804001b8b80 {.cmd_flags=0x7a01, .rq_flags=0x2040, .tag=28, .internal_tag=-1} sde sde/mq/3/tags:nr_tags=62 sde/mq/3/tags:nr_reserved_tags=0 sde/mq/3/tags:active_queues=0 sde/mq/3/tags: sde/mq/3/tags:bitmap_tags: sde/mq/3/tags:depth=62 sde/mq/3/tags:busy=31 sde/mq/3/tags:bits_per_word=8 sde/mq/3/tags:map_nr=8 sde/mq/3/tags:alloc_hint={23, 23, 52, 1, 55, 29, 17, 22, 34, 48, 25, 49, 37, 43, 58, 25, 6, 20, 50, 14, 55, 7, 21, 17, 26, 36, 43, 43, 4, 6, 3, 47} sde/mq/3/tags:wake_batch=7 sde/mq/3/tags:wake_index=0 sde/mq/3/tags:ws={ sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags: {.wait_cnt=7, .wait=inactive}, sde/mq/3/tags:} sde/mq/3/tags:round_robin=0 sde/mq/3/tags_bitmap:00000000: ffff ff1f 0000 0018 sdf sdg sdh sdi sdj sr0 I am using the "none" scheduler: # cat /sys/class/block/sdd/queue/scheduler [none] # cat /sys/class/block/sde/queue/scheduler [none] What is remarkable is that I see pending requests for the sd* devices but not for any dm* device and also that the number of busy requests (31) is exactly half of the queue depth (62). Could this indicate that the block layer stopped processing these blk-mq queues? If this happens and I run the following command to trigger SRP logout: # for p in /sys/class/srp_remote_ports/*; do echo 1 >$p/delete; done then the test that was running finishes, reports that removing the multipath device failed and echo w >/proc/sysrq-trigger produces the following output: sysrq: SysRq : Show Blocked State task PC stack pid father systemd-udevd D 0 14490 508 0x00000106 Call Trace: __schedule+0x3df/0xc10 schedule+0x38/0x90 io_schedule+0x11/0x40 __lock_page+0x10c/0x140 truncate_inode_pages_range+0x45d/0x780 truncate_inode_pages+0x10/0x20 kill_bdev+0x30/0x40 __blkdev_put+0x71/0x220 blkdev_put+0x49/0x170 blkdev_close+0x20/0x30 __fput+0xe8/0x1f0 ____fput+0x9/0x10 task_work_run+0x80/0xb0 do_exit+0x30c/0xc70 do_group_exit+0x4b/0xc0 get_signal+0x2c2/0x930 do_signal+0x23/0x670 exit_to_usermode_loop+0x5d/0xa0 do_syscall_64+0xd5/0x140 entry_SYSCALL64_slow_path+0x25/0x25 Does this indicate that truncate_inode_pages_range() is waiting because a block layer queue got stuck? The kernel tree I used in my tests is the result of merging the following commits: * commit 3dca2c2f3d3b from git://git.kernel.dk/linux-block.git ("Merge branch 'for-4.12/block' into for-next") * commit f88ab0c4b481 from git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git ("scsi: libsas: fix ata xfer length") * commit ad0376eb1483 from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ("Merge tag 'edac_for_4.11_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp") Please let me know if you need more information. Thanks, Bart.