On Tue, Dec 7, 2021 at 8:58 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, 7 Dec 2021 19:46:57 -0500 Nico Pache <npache@xxxxxxxxxx> wrote:
>
> >
> >
> > On 12/7/21 18:47, Andrew Morton wrote:
> > > (cc's added)
> > >
> > > On Tue, 7 Dec 2021 16:49:02 -0500 Joel Savitz <jsavitz@xxxxxxxxxx> wrote:
> > >
> > >> In the case that two or more processes share a futex located within
> > >> a shared mmaped region, such as a process that shares a lock between
> > >> itself and a number of child processes, we have observed that when
> > >> a process holding the lock is oom killed, at least one waiter is never
> > >> alerted to this new development and simply continues to wait.
> > >
> > > Well dang. Is there any way of killing off that waiting process, or do
> > > we have a resource leak here?
> >
> > If I understood your question correctly, there is a way to recover the system by
> > killing the process that is utilizing the futex; however, the purpose of robust
> > futexes is to avoid having to do this.
>
> OK. My concern was whether we have a way in which userspace can
> permanently leak memory, which opens a (lame) form of denial-of-service
> attack.

I believe the resources are freed when the process is killed, so to my
knowledge there is no resource leak in the case we were investigating.

> > From my work with Joel on this it seems like a race is occurring between the
> > oom_reaper and the exit signal sent to the OOM'd process. By setting the
> > futex_exit_release before these signals are sent we avoid this.
>
> OK. It would be nice if the patch had some comments explaining *why*
> we're doing this strange futex thing here. Although that wouldn't be
> necessary if futex_exit_release() was documented...
>

Sounds good, will send a v2 tomorrow

Best,
Joel Savitz
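
P.S. For anyone following along, here is a minimal userspace sketch of the behavior robust futexes are meant to guarantee. It is not the reproducer from this thread and does not involve the OOM killer: it only shows the normal path, where a child dies while holding a process-shared robust pthread mutex in a MAP_SHARED region and the next locker is told about it via EOWNERDEAD. The function name lock_after_owner_death() is made up for illustration; the pthread and mmap calls are the standard POSIX/glibc ones. Build with -pthread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the result of the parent's
 * pthread_mutex_lock() after the child exited while holding the lock.
 * With a working robust list this is expected to be EOWNERDEAD.
 * Returns -1 on setup failure.
 */
int lock_after_owner_death(void)
{
    /* Place the mutex in memory shared between parent and child. */
    pthread_mutex_t *m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    /* Process-shared so both processes may use it, robust so the
     * kernel marks it when the owner dies. */
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (pthread_mutex_init(m, &attr) != 0)
        return -1;
    pthread_mutexattr_destroy(&attr);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: take the lock and exit without releasing it. */
        if (pthread_mutex_lock(m) != 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait until the owner is gone, then try to lock. */
    if (waitpid(pid, NULL, 0) < 0)
        return -1;

    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* We own the lock now; mark the state consistent again. */
        pthread_mutex_consistent(m);
        pthread_mutex_unlock(m);
    } else if (rc == 0) {
        pthread_mutex_unlock(m);
    }

    pthread_mutex_destroy(m);
    munmap(m, sizeof(*m));
    return rc;
}
```

Calling lock_after_owner_death() from main() and comparing the result against EOWNERDEAD is enough to check the normal path. The case reported above is the one where the waiter never gets that notification.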