On Mon, Feb 15, 2021 at 02:36:38PM +0100, Donald Buczek wrote:
> On 13.01.21 22:53, Dave Chinner wrote:
> > [...]
> > I agree that a throttling fix is needed, but I'm trying to
> > understand the scope and breadth of the problem first instead of
> > jumping the gun and making the wrong fix for the wrong reasons that
> > just papers over the underlying problems that the throttling bug has
> > made us aware of...
>
> Are you still working on this?
>
> If it takes more time to understand the potential underlying problem, the fix for the problem at hand should be applied.
>
> This is a real world problem, accidentally found in the wild. It appears very rarely, but it freezes a filesystem or the whole system. It exists in 5.7, 5.8, 5.9, 5.10 and 5.11 and is caused by c7f87f3984cf ("xfs: fix use-after-free on CIL context on shutdown") which silently added a condition to the wakeup. The condition is based on a wrong assumption.
>
> Why is this "papering over"? If a reminder was needed, there were better ways than randomly hanging the system.
>
> Why is
>
>     if (ctx->space_used >= XLOG_CIL_BLOCKING_SPACE_LIMIT(log))
>         wake_up_all(&cil->xc_push_wait);
>
> , which doesn't work reliably, preferable to
>
>     if (waitqueue_active(&cil->xc_push_wait))
>         wake_up_all(&cil->xc_push_wait);
>
> which does?
>

JFYI, Dave followed up with a patch a couple weeks or so ago:

https://lore.kernel.org/linux-xfs/20210128044154.806715-5-david@xxxxxxxxxxxxx/

Brian

> Best
> Donald
>
> > Cheers,
> >
> > Dave
>