Re: [PATCH] mm/oom_kill: wake futex waiters before annihilating victim shared mutex

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Tue, 7 Dec 2021 17:58:16 -0800

On Tue, 7 Dec 2021 19:46:57 -0500 Nico Pache <npache@xxxxxxxxxx> wrote:

> 
> 
> On 12/7/21 18:47, Andrew Morton wrote:
> > (cc's added)
> > 
> > On Tue,  7 Dec 2021 16:49:02 -0500 Joel Savitz <jsavitz@xxxxxxxxxx> wrote:
> > 
> >> In the case that two or more processes share a futex located within
> >> a shared mmapped region, such as a process that shares a lock between
> >> itself and a number of child processes, we have observed that when
> >> a process holding the lock is OOM-killed, at least one waiter is never
> >> alerted to this new development and simply continues to wait.
> > 
> > Well dang.  Is there any way of killing off that waiting process, or do
> > we have a resource leak here?
> 
> If I understood your question correctly, there is a way to recover the system by
> killing the process that is utilizing the futex; however, the purpose of robust
> futexes is to avoid having to do this.

OK.  My concern was whether we have a way in which userspace can
permanently leak memory, which opens a (lame) form of denial-of-service
attack.

> From my work with Joel on this, it seems like a race is occurring between the
> oom_reaper and the exit signal sent to the OOM'd process. By calling
> futex_exit_release() before these signals are sent, we avoid this.

OK.  It would be nice if the patch had some comments explaining *why*
we're doing this strange futex thing here.  Although that wouldn't be
necessary if futex_exit_release() was documented...
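
For context, here is a minimal userspace sketch of the contract under
discussion: a robust, process-shared mutex in a MAP_SHARED region, where a
survivor is expected to see EOWNERDEAD once the owner dies. It is not taken
from the patch or from Joel's reproducer. The helper name
lock_after_owner_death() is hypothetical, and the child here dies through a
plain _exit(), so this shows only the ordinary exit path working as intended,
not the OOM-kill race described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): returns what pthread_mutex_lock()
 * reports to a survivor after the previous owner died while holding the lock.
 * Returns -1 if the setup itself fails.
 */
int lock_after_owner_death(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t *m;
    pid_t pid;
    int rc;

    /* The mutex lives in a MAP_SHARED region so parent and child share it. */
    m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    /* Process-shared and robust: owner death must be reported to waiters. */
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);

    pid = fork();
    if (pid < 0) {
        munmap(m, sizeof(*m));
        return -1;
    }
    if (pid == 0) {
        /* Child: take the lock and die without releasing it. */
        pthread_mutex_lock(m);
        _exit(0);
    }

    /* Parent: wait until the owner is gone, then try to take the lock. */
    waitpid(pid, NULL, 0);
    rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* We own the lock now; mark its state consistent before reuse. */
        int fixed = pthread_mutex_consistent(m);
        assert(fixed == 0);
    }
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(m);

    pthread_mutex_destroy(m);
    munmap(m, sizeof(*m));
    return rc;
}
```

Called from a test program (built with -pthread), the helper should return
EOWNERDEAD on a normal exit. The report in this thread is that a waiter can
miss that notification when the owner is OOM-killed instead.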