Well, my apologies, I haven't been able to reproduce the problem, so there's
nothing to go on here.

We had a bug (a local patch) which caused this, and then I had a bug in my
test case, so I was confused.

I also have a recollection of this happening in older kernels (3.8, I think),
but I am not going to go back that far, since even if the problem exists we
have no evidence that it happens frequently.

Thanks!

On Tue, Jun 27, 2017 at 8:50 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> On Tue 27-06-17 08:22:36, Luigi Semenzato wrote:
>> (sorry, I forgot to turn off HTML formatting)
>>
>> Thank you, I can try this on ToT, although I think that the problem is
>> not with the OOM killer itself but earlier---i.e. invoking the OOM
>> killer seems unnecessary and wrong. Here's the question.
>>
>> The general strategy for page allocation seems to be (please correct
>> me as needed):
>>
>> 1. look in the free lists
>> 2. if that did not succeed, try to reclaim, then try again to allocate
>> 3. keep trying as long as progress is made (i.e. something was reclaimed)
>> 4. if no progress was made and no pages were found, invoke the OOM killer.
>
> Yes, that is the case, very broadly speaking. The hard question really is
> what "no progress" actually means. We use "no pages could be reclaimed"
> as the indicator. We cannot blow up at the first such instance, of
> course, because that could be too early (e.g. data under writeback,
> and many other details). With 4.7+ kernels this is implemented in
> should_reclaim_retry. Prior to the rework we used to rely on
> zone_reclaimable, which simply checked how many pages we have scanned
> since the last page was freed; if that is 6 times the reclaimable
> memory, then we simply give up. It had some issues, described in
> 0a0337e0d1d1 ("mm, oom: rework oom detection").
>
>> I'd like to know if that "progress is made" notion is possibly buggy.
>> Specifically, does it mean "progress is made by this task"? Is it
>> possible that resource contention creates a situation where most tasks
>> in most cases can reclaim and allocate, but one task randomly fails to
>> make progress?
>
> This can happen, although it is quite unlikely. We are trying to
> throttle allocations, but you can hardly fight consistent bad luck ;)
>
> In order to see what is going on in your particular case we need an OOM
> report, though.
> --
> Michal Hocko
> SUSE Labs
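
For readers following the thread, here is a minimal userspace C sketch of the
allocation slow path as described above: try the free lists, reclaim and retry
while progress is made, and fall back to the OOM killer once "no progress" is
detected. It is not the kernel implementation; the helpers try_free_lists()
and try_reclaim() are hypothetical stand-ins, and zone_reclaimable() here only
models the pre-4.7 heuristic (give up once pages scanned since the last freed
page reach 6 times the reclaimable memory), not the real should_reclaim_retry
logic.

#include <stdbool.h>
#include <stdio.h>

struct zone_state {
	unsigned long free_pages;        /* pages available on the free lists */
	unsigned long reclaimable_pages; /* pages that could still be reclaimed */
	unsigned long pages_scanned;     /* scanned since the last page was freed */
};

/* Hypothetical: take a page from the free lists if one is available. */
static bool try_free_lists(struct zone_state *z)
{
	if (z->free_pages > 0) {
		z->free_pages--;
		return true;
	}
	return false;
}

/* Hypothetical: reclaim one page; returns true if progress was made. */
static bool try_reclaim(struct zone_state *z)
{
	z->pages_scanned++;
	if (z->reclaimable_pages > 0) {
		z->reclaimable_pages--;
		z->free_pages++;
		z->pages_scanned = 0;	/* progress resets the scan counter */
		return true;
	}
	return false;
}

/* Simplified model of the pre-4.7 zone_reclaimable() heuristic. */
static bool zone_reclaimable(const struct zone_state *z)
{
	return z->pages_scanned < 6 * z->reclaimable_pages;
}

/* Sketch of the slow path: steps 1-4 from the quoted message. */
static bool alloc_page_sketch(struct zone_state *z)
{
	for (;;) {
		if (try_free_lists(z))
			return true;		/* step 1: free lists */
		if (try_reclaim(z))
			continue;		/* steps 2-3: reclaim made progress, retry */
		if (zone_reclaimable(z))
			continue;		/* real reclaim can fail transiently; keep trying */
		printf("no progress: would invoke the OOM killer here\n");
		return false;			/* step 4 */
	}
}

int main(void)
{
	struct zone_state z = {
		.free_pages = 0,
		.reclaimable_pages = 2,
		.pages_scanned = 0,
	};

	for (int i = 0; i < 4; i++)
		printf("allocation %d: %s\n", i,
		       alloc_page_sketch(&z) ? "ok" : "failed");
	return 0;
}

In this toy model the first two allocations succeed via reclaim and the rest
hit the "no progress" path; the real kernel adds throttling, writeback waits,
and watermark checks around the same basic decision.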