On Mon, Sep 04, 2023 at 11:45:03AM +1000, Dave Chinner wrote:
> > thread B: write()
> >     finds file
> >     grabs ->f_pos_lock
> >     calls into filesystem
> >     blocks on fs lock held by A
> > thread C: read()/write()/lseek() on the same file
> >     blocks on ->f_pos_lock
>
> Yes, that's exactly what I said in a followup email - we need to
> know what happened to thread A, because that might be where we are
> stuck on a leaked lock.
>
> I saw quite a few reports where lookup/readdir are also stuck trying
> to get an inode lock - those are the "thread B"s in the above example
> - but there's no indication left of what happened with thread A.
>
> If thread A was blocked all that time on something, then the hung
> task timer should fire on it, too. If it is running in a tight
> loop, the NMI would have dumped a stack trace from it.
>
> But neither of those things happened, so it's either leaked
> something or it's in a loop with a short term sleep so doesn't
> trigger the hung task timer. sysrq-w output will capture that
> without all the noise of sysrq-t....

Here's what brought sysrq-t:

| > The report does not have info necessary to figure this out -- no
| > backtrace for whichever thread which holds f_pos_lock. I clicked on a
| > bunch of other reports and it is the same story.
| >
| > Can the kernel be configured to dump backtraces from *all* threads?
| >
| > If there is no feature like that I can hack it up.
|
| <break>t
|
| over serial console, or echo t >/proc/sysrq-trigger would do it...

A question specifically about getting the stack traces...