On 2022-08-25 16:19, Logan Gunthorpe wrote:
> Given the conditions of hitting the bug, I fully expected this to be an
> issue in the raid code, but unless I'm missing something, it sure looks
> to me like a deadlock issue in the wbt code, which makes me wonder why
> nobody else has hit it. Is there something else I'm missing that are
> supposed to be waking up these processes? Or something weird about the
> raid5+journal+loop code that is causing wb_recent_wait() to always be false?

I've made some progress on this nasty bug. I've gotten far enough to know
it's not related to blk-wbt or the block layer.

It turns out a bunch of bios are stuck queued in a blk_plug in the
md_raid5 thread, while that thread appears to be stuck in an infinite
loop (so it never schedules or does anything else that would flush the
plug).

I'm still debugging to find the root cause of that infinite loop, but I
wanted to send an update that my earlier suspicion of the wbt code was
not correct.

Logan