On 4/8/22 04:37, Thomas Gleixner wrote:
> On Fri, Apr 08 2022 at 10:15, Peter Zijlstra wrote:
>> On Thu, Apr 07, 2022 at 11:28:09PM -0400, Nico Pache wrote:
>>> Theoretically a failure can still occur if there are locks mapped as
>>> PRIVATE|ANON; however, the robust futexes are a best-effort approach.
>>> This patch only strengthens that best-effort.
>>>
>>> The following case can still fail:
>>> robust head (skipped) -> private lock (reaped) -> shared lock
>>> (skipped)
>>
>> This is still all sorts of confused.. it's a list head, the entries can
>> be in any random other VMA. You must not remove *any* user memory before
>> doing the robust thing. Not removing the VMA that contains the head is
>> pointless in the extreme.
>>
>> Did you not read the previous discussion?
>
> Aside of that we all agreed that giving a oom-killed task time to
> cleanup itself instead of brute force cleaning it up immediately, which
> is the real problem here. Can we fix that first before adding broken
> heuristics?

We've tried multiple approaches to reproduce the case you are talking
about, with no success. Why make a change for something that we can't
reproduce when we are sure this works for all the cases we've attempted?

I also don't see how this is a broken heuristic. If anything, adding a
delay is broken. How do we ensure the delay is long enough for the exit
path to clean up the futexes? On a heavily contended CPU with high
memory pressure, the delay may also lead to other processes
unnecessarily OOMing.

Cheers,
-- Nico

>
> Thanks,
>
>         tglx