Re: [PATCH v8] oom_kill.c: futex: Don't OOM reap the VMA containing the robust_list_head

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Mon, 11 Apr 2022 09:47:14 +0200

Michal,

On Mon, Apr 11 2022 at 08:48, Michal Hocko wrote:
> On Fri 08-04-22 23:41:11, Thomas Gleixner wrote:
>> So why would a process private robust mutex be any different from a
>> process shared one?
>
> Purely from the OOM POV they are slightly different because the OOM
> killer always kills all threads which share the mm with the selected
> victim (with an exception of the global init - see __oom_kill_process).
> Note that this is including those threads which are not sharing signals
> handling.
> So clobbering private locks shouldn't be observable to an alive thread
> unless I am missing something.

Yes, it kills everything, but the reaper also reaps non-shared VMAs. So
if the process private futex sits in a reaped VMA the shared one becomes
unreachable.

> On the other hand I do agree that delayed oom_reaper execution is a
> reasonable workaround and the most simplistic one.

I think it's more than a workaround. It's a reasonable expectation that
the kernel side of the user space threads can mop up the mess the user
space part created. So even if one of of N threads is stuck in a place
where it can't, then N-1 can still reach do_exit() and mop their mess
up.

The oom reaper is the last resort to resolve the situation in case of a
stuck task. No?

> If I understand your example code then we would need to evaluate the
> whole robust list and that is simply not feasible because that would
> require a #PF in general case.

Right. The robust list exit code does the user access with pagefaults
disabled and if it fails, it terminates the list walk. Bad luck :)

Thanks,

        tglx