On Tue, 7 Dec 2021 19:46:57 -0500 Nico Pache <npache@xxxxxxxxxx> wrote:

> On 12/7/21 18:47, Andrew Morton wrote:
> > (cc's added)
> >
> > On Tue, 7 Dec 2021 16:49:02 -0500 Joel Savitz <jsavitz@xxxxxxxxxx> wrote:
> >
> >> In the case that two or more processes share a futex located within
> >> a shared mmaped region, such as a process that shares a lock between
> >> itself and a number of child processes, we have observed that when
> >> a process holding the lock is oom killed, at least one waiter is never
> >> alerted to this new development and simply continues to wait.
> >
> > Well dang.  Is there any way of killing off that waiting process, or do
> > we have a resource leak here?
>
> If I understood your question correctly, there is a way to recover the system by
> killing the process that is utilizing the futex; however, the purpose of robust
> futexes is to avoid having to do this.

OK.  My concern was whether we have a way in which userspace can
permanently leak memory, which opens a (lame) form of denial-of-service
attack.

> From my work with Joel on this it seems like a race is occurring between the
> oom_reaper and the exit signal sent to the OOM'd process. By setting the
> futex_exit_release before these signals are sent we avoid this.

OK.  It would be nice if the patch had some comments explaining *why*
we're doing this strange futex thing here.  Although that wouldn't be
necessary if futex_exit_release() was documented...
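
To illustrate the userspace behaviour under discussion, here is a minimal sketch, not taken from the patch. It assumes glibc's robust, process-shared pthread mutexes, which are built on the kernel's robust futex list. The function name lock_after_owner_killed() is hypothetical. A plain SIGKILL stands in for the OOM kill, so this shows only the expected case, where the next locker is told the owner died. It does not reproduce the oom_reaper race.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: a child takes a robust mutex that lives in a
 * shared mapping and is then killed while holding it.  Returns the
 * result of the parent's subsequent pthread_mutex_lock(), or -1 if
 * the setup failed.  Build with -pthread.
 */
int lock_after_owner_killed(void)
{
    /* Place the mutex in memory shared between parent and child. */
    pthread_mutex_t *m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) != 0 ||
        pthread_mutex_init(m, &attr) != 0)
        return -1;
    pthread_mutexattr_destroy(&attr);

    /* Pipe used by the child to report that it now holds the lock. */
    int ready[2];
    if (pipe(ready) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: take the lock, tell the parent, then wait to be killed. */
        close(ready[0]);
        if (pthread_mutex_lock(m) != 0)
            _exit(1);
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        for (;;)
            pause();
    }

    /* Parent: wait until the child owns the lock, then kill it. */
    close(ready[1]);
    char c;
    if (read(ready[0], &c, 1) != 1)
        return -1;
    close(ready[0]);
    kill(pid, SIGKILL);             /* stand-in for the OOM kill */
    waitpid(pid, NULL, 0);

    /* With a robust mutex this should report EOWNERDEAD, not block. */
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* Mark the protected state as repaired so the mutex is usable. */
        int err = pthread_mutex_consistent(m);
        assert(err == 0);
        (void)err;
    }
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(m);

    munmap(m, sizeof(*m));
    return rc;
}
```

A caller would compare the return value against EOWNERDEAD. The report describes the failure mode where a waiter never gets that notification after the holder is OOM killed.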