On Tue, Sep 15, 2020 at 12:01 PM Matthieu Baerts <matthieu.baerts@xxxxxxxxxxxx> wrote:
>
> Earlier today, I got one trace with 'sysrq-T' but it is more than 1100
> lines. It is attached to this email also with a version from
> "decode_stacktrace.sh", I hope that's alright.

Yeah, there's nothing interesting there. The only relevant tasks seem
to be the packetdrill ones that are blocked on the page lock.

I don't see anything that looks even *remotely* like it could be
holding a page lock and be waiting for anything else.

A couple of pipe readers, a number of parents waiting on their
children, one futex waiter, one select loop.. Nothing at all
unexpected or remotely suspicious. The packetdrill ones look very
similar.

> I forgot one important thing, I was on top of David Miller's net-next
> branch by reflex. I can redo the traces on top of linux-next if needed.

Not likely an issue.

I'll go stare at the page lock code again to see if I've missed
anything. I still suspect it's a latent ABBA deadlock that is just
much *much* easier to trigger with the synchronous lock handoff, but I
don't see where it is.

I guess this is all fairly theoretical since we apparently need to do
that hybrid "limited fairness" patch anyway, and it fixes your issue,
but I hate not understanding the problem.

              Linus