Re: [RFC] memory reserve for userspace oom-killer

On Tue, 4 May 2021 17:37:32 Shakeel Butt wrote:
>On Wed, Apr 21, 2021 at 7:29 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>[...]
>> > > What if the pool is depleted?
>> >
>> > This would mean that either the estimate of mempool size is bad or
>> > oom-killer is buggy and leaking memory.
>> >
>> > I am open to any design directions for mempool or some other way where
>> > we can provide a notion of memory guarantee to oom-killer.
>>
>> OK, thanks for clarification. There will certainly be hard problems to
>> sort out[1] but the overall idea makes sense to me and it sounds like a
>> much better approach than an OOM-specific solution.
>>
>>
>> [1] - how the pool is going to be replenished without hitting all
>> potential reclaim problems (thus dependencies on all other tasks
>> directly/indirectly) yet without relying on any background workers to do
>> that on the task's behalf without proper accounting etc...
>> --
>
>I am currently contemplating between two paths here:
>
>First, the mempool, exposed through either prctl or a new syscall.
>Users would need to trace their userspace oom-killer (or whatever
>their use case is) to find an appropriate mempool size they would need
>and periodically refill the mempools if allowed by the state of the
>machine. The challenge here is finding a good value for the mempool
>size and coordinating the refilling of mempools.
>
>Second is a mix of Roman's and Peter's suggestions but much
>simplified. A very simple watchdog with a kill-list of processes: if
>userspace does not pet the watchdog within a specified time, it will
>kill all the processes in the kill-list. The challenge here is to
>maintain/update the kill-list.
>
>I would prefer the direction which oomd and lmkd are open to adopt.
>
>Any suggestions?
>
AFAICT the kill-list is drained once the watchdog starts doing its job.
Could you say a bit more about what happens if a process on the list
exits before the watchdog wakes up? What happens if the watchdog signals
a pid that no longer exists, with the failure simply ignored?
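
To make the stale-pid question concrete, here is a minimal userspace sketch,
not the proposed kernel watchdog. It assumes the kill-list is a plain list of
pids. The names drain_kill_list() and reaped_child_pid() are hypothetical
helpers made up for this illustration. Signalling a pid that has already
exited fails with ESRCH, which Python reports as ProcessLookupError, so the
drain can record the entry as gone and continue.

```python
import os
import signal
import subprocess
import sys


def drain_kill_list(kill_list, sig=signal.SIGKILL):
    """Signal every pid in kill_list (hypothetical helper).

    Returns (signalled, gone): pids that accepted the signal and pids
    that no longer exist. Other errors, such as PermissionError,
    propagate to the caller.
    """
    signalled, gone = [], []
    for pid in kill_list:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            # ESRCH: the process exited (and was reaped) before we got here.
            gone.append(pid)
        else:
            signalled.append(pid)
    return signalled, gone


def reaped_child_pid():
    """Start a short-lived child, reap it, and return its now-stale pid."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    return child.pid
```

Ignoring ESRCH is harmless by itself. The real hazard with raw pids is
recycling: a stale entry may later name an unrelated process. On Linux 5.3+
a pidfd (os.pidfd_open() and signal.pidfd_send_signal() in Python 3.9+) pins
the identity of the target and avoids that, though whether the kill-list
would hold pids or pidfds is not settled in this thread.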

Other than that, what do you have in mind as the challenge in maintaining
the kill-list?
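
For reference, the watchdog semantics as I read them from the description
quoted above can be sketched as follows. This is a userspace model under
stated assumptions, not an implementation proposal: the class name Watchdog,
its methods and the injectable clock are all hypothetical, and the list is
assumed to be drained in one go when the deadline is missed.

```python
import time


class Watchdog:
    """Toy model: pet() resets the deadline, poll() drains the kill-list."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock            # injectable so the model can be tested
        self.kill_list = set()        # pids registered by userspace
        self.last_pet = clock()

    def pet(self):
        """Userspace signals that it is still alive."""
        self.last_pet = self.clock()

    def expired(self):
        """True once more than timeout_s has passed since the last pet."""
        return self.clock() - self.last_pet > self.timeout_s

    def poll(self):
        """Return the pids to kill if the deadline was missed, else [].

        The list is drained on expiry, so a second poll returns [].
        """
        if not self.expired():
            return []
        victims = sorted(self.kill_list)
        self.kill_list.clear()
        return victims
```

In this model, maintaining the kill-list means keeping self.kill_list in
sync with process lifetimes between pets, which is where entries for
already-exited processes come from.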

Hillf



