Hi Jens,

While testing md/raid5 with the journal option on loop devices, I've found an easily reproducible hang on my system. Simply running an fio write job with the md thread count (group_thread_cnt) set to 4 can hit it. Curiously, though, the hang does not occur unless the journal is in use. I'm running on the current md/md-next branch, but I've been seeing this bug for a couple of months now on recent kernels and have no idea how long it has been in the kernel.

I end up seeing multiple hung tasks with the following stack trace:

  schedule+0x9e/0x140
  io_schedule+0x70/0xb0
  rq_qos_wait+0x153/0x210
  wbt_wait+0x127/0x1f0
  __rq_qos_throttle+0x38/0x60
  blk_mq_submit_bio+0x589/0xcd0
  __submit_bio+0xe6/0x100
  submit_bio_noacct_nocheck+0x42e/0x470
  submit_bio_noacct+0x4c2/0xbb0
  ops_run_io+0x46b/0x1a30
  handle_stripe+0xcd3/0x36c0
  handle_active_stripes.constprop.0+0x6f6/0xa60
  raid5_do_work+0x177/0x330
  process_one_work+0x609/0xb00
  worker_thread+0x2d4/0x710
  kthread+0x18c/0x1c0
  ret_from_fork+0x1f/0x30

When this happens, I find between 1 and roughly 10 inflight IOs on the WBT of the underlying loop devices, as seen in /sys/kernel/debug/block/loop[0-3]/rqos/wbt/inflight.

I've done some debugging in this area and this is what I'm seeing: a few IOs start going to sleep in rq_qos_wait() once the inflight counter reaches the limit (96 in my case), and more tasks are put to sleep as the limit continues to be exceeded. So far that makes sense.

I put some tracing in wbt_rqw_done() and can see the inflight count going back down to a low number as other IOs complete, but then it hangs before reaching zero. wbt_rqw_done() never wakes up any of the sleeping threads because, for some reason, wb_recent_wait(rwb) always returns false, which makes the limit zero; so the conditional

  if (inflight && inflight >= limit)
          return;

always bails out, because inflight is always greater than the zero limit (some of the inflight IOs are themselves sleeping, waiting to be woken up). The sleeping tasks therefore remain asleep forever. I've also verified that rwb_wake_all() never gets called in this scenario. (A small user-space model of this logic is included at the end of this mail.)

Given the conditions for hitting the bug, I fully expected this to be an issue in the raid code, but unless I'm missing something, it sure looks to me like a deadlock in the wbt code, which makes me wonder why nobody else has hit it. Is there some other mechanism I'm missing that is supposed to be waking up these processes? Or is there something peculiar about the raid5+journal+loop combination that causes wb_recent_wait() to always return false?

Any thoughts?

Thanks,

Logan
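
P.S. In case it helps illustrate the above, here is a quick user-space model of the wake-up decision in wbt_rqw_done() as I understand it. The names mirror blk-wbt.c, but this is just my paraphrase of the logic, not the kernel code itself, and wb_recent_wait() is hard-coded to false to match what I observe in the hung state:

/*
 * User-space model of the wake-up check in wbt_rqw_done(). This is my
 * paraphrase of the logic, not the actual kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

/* Assumption: in the hung state, wb_recent_wait() keeps returning false */
static bool wb_recent_wait(void)
{
	return false;
}

/* Returns true if the completion path would wake up sleeping waiters */
static bool would_wake(int inflight, int wb_normal, bool write_cache)
{
	int limit;

	/* Write-back caching with no recent wait drops the limit to zero */
	if (write_cache && !wb_recent_wait())
		limit = 0;
	else
		limit = wb_normal;

	/* The conditional quoted above: bail out while above the limit */
	if (inflight && inflight >= limit)
		return false;

	return true;
}

int main(void)
{
	/*
	 * With the limit stuck at zero, any non-zero inflight count
	 * suppresses the wake-up, so the sleepers that make up that
	 * count never get to run and the count never reaches zero.
	 */
	for (int inflight = 4; inflight >= 0; inflight--)
		printf("inflight=%d -> would_wake=%d\n",
		       inflight, would_wake(inflight, 48, true));
	return 0;
}

Compiling and running this shows would_wake staying false for every non-zero inflight count, which matches the hang I'm seeing: the count can only drop to zero once the sleepers complete, but nothing ever wakes them.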