Andy Lutomirski wrote on Thu, Dec 17, 2015:
> This could be QEMU's analysis script screwing up.  Is there a good way
> for me to gather more info?

I finally took some time to reproduce it (sorry for the delay).

Using your config, virtme commit (17363c2) and kernel tag v4.4-rc3, I
was able to reproduce it just fine with my qemu (2.4.90).

Now for the fun bit... I ran it with a gdb server; attaching gdb and
running cont always 'unblocks' it.

Using the kernel gdb scripts (lx-ps) I see about 250 kworker threads
running, and the backtraces all look the same:

[ 20.273945] [<ffffffff818c3910>] schedule+0x30/0x80
[ 20.274644] [<ffffffff818c3c39>] schedule_preempt_disabled+0x9/0x10
[ 20.275539] [<ffffffff818c6147>] __mutex_lock_slowpath+0x107/0x2f0
[ 20.276421] [<ffffffff811cf02e>] ? lookup_fast+0xbe/0x320
[ 20.277195] [<ffffffff818c6345>] mutex_lock+0x15/0x30
[ 20.277916] [<ffffffff811d0df7>] walk_component+0x1a7/0x270

So given it unblocks after hooking gdb + cont, I'm actually thinking
this might be a pure scheduling issue? (e.g. the thread is never
re-scheduled or something like that?)

I can't see any task not in schedule() in your sysrq dump task
transcript either.

Not sure how to go about debugging that, to be honest.
I've tried both the default single virtual cpu and -smp 3 or 4, and
both reproduce it; cpu usage on the host is always low, so it doesn't
look like there's any busy-polling involved.

This is a pretty subtle bug we have there..

-- 
Dominique Martinet