On Fri, May 12, 2023 at 11:45:47AM +1000, Dave Chinner wrote:
>
> Yeah, this is papering over the observed symptom, not addressing the
> root cause of the inodegc flush delay. What do you see when you run
> sysrq-w and sysrq-l? Are there inodegc worker threads blocked
> performing inodegc?

I will try this next time we encounter this.

> e.g. inodegc flushes could simply be delayed by an unlinked inode
> being processed that has millions of extents that need to be freed.
>
> In reality, inode reclaim can block for long periods of time
> on any filesystem, so the concept of "inode reclaim should
> not block when PF_EXITING" is not a behaviour that we guarantee
> anywhere or could guarantee across the board.
>
> Let's get to the bottom of why inodegc has apparently stalled before
> trying to work out how to fix it...

I'm happy to try, but I think it is also worth applying this patch. Like
I said in the other thread, having to evac a box to get rid of an
unkillable userspace process is annoying.

Thanks for the debugging tips.

Tycho