On Tue, Jan 26, 2016 at 06:07:52PM +0100, Vlastimil Babka wrote:
> On 01/25/2016 07:45 PM, Johannes Weiner wrote:
> >>> - One of the long-standing issues related to OOM handling is when to
> >>>   actually declare OOM. There are workloads which might be thrashing on
> >>>   the few last remaining pagecache pages or on swap, which makes the
> >>>   system completely unusable for a considerable amount of time, yet the
> >>>   OOM killer is not invoked. Can we finally do something about that?
> >
> > I'm working on this, but it's not an easy situation to detect.
> >
> > We can't decide based on the amount of page cache, as you could have very
> > little of it and still be fine. Most of it could still be used-once.
> >
> > We can't decide based on the number or rate of (re)faults, because this
> > spikes during startup and workingset changes, or can even be sustained
> > when working with a data set that you'd never expect to fit into
> > memory in the first place, while still making acceptable progress.
>
> I would hope that the workingset code should help distinguish workloads
> thrashing due to low memory from those that can't fit no matter what?
> Or would it require tracking the lifetime of so many evicted pages that
> the memory overhead of that would be infeasible?

Yes, using the workingset code is exactly my plan. The only thing it
requires on top is a time component. Then we can kick the OOM killer
based on the share of time a workload (the system?) spends thrashing.
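To make the time-component idea concrete, here is a minimal userspace
sketch, not the actual kernel work: the 2-second window, the 10%
threshold, and all function and struct names below are made up for
illustration. The idea is to charge the time spent waiting on refaults
of recently evicted (workingset) pages to a per-window counter, and to
consider OOM once that share of the window crosses a threshold.

/*
 * Userspace sketch only -- NOT kernel code and not the actual
 * implementation.  Window length, threshold, and names are invented
 * for illustration.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define WINDOW_NS        (2ULL * 1000000000ULL)  /* 2s observation window */
#define THRASH_THRESHOLD 10                      /* percent of the window */

struct thrash_state {
	uint64_t window_start;  /* start of the current window, ns */
	uint64_t stall_ns;      /* refault wait time accrued in this window */
};

/* Called with the duration of one refault wait on a workingset page. */
static bool record_refault_wait(struct thrash_state *ts,
				uint64_t now, uint64_t wait_ns)
{
	if (now - ts->window_start >= WINDOW_NS) {
		/* Roll over to a new observation window. */
		ts->window_start = now;
		ts->stall_ns = 0;
	}

	ts->stall_ns += wait_ns;

	/* Share of the window spent waiting on refaults, in percent. */
	uint64_t share = ts->stall_ns * 100 / WINDOW_NS;

	return share >= THRASH_THRESHOLD;  /* true => consider OOM */
}

int main(void)
{
	struct thrash_state ts = { .window_start = 0, .stall_ns = 0 };
	uint64_t now = 0;

	/* Simulate a 30ms refault wait every 100ms of elapsed time. */
	for (int i = 0; i < 40; i++) {
		now += 100ULL * 1000000ULL;
		if (record_refault_wait(&ts, now, 30ULL * 1000000ULL)) {
			printf("thrashing share >= %d%% at t=%.1fs -> would invoke OOM killer\n",
			       THRASH_THRESHOLD, now / 1e9);
			break;
		}
	}
	return 0;
}

With those made-up numbers, the simulated workload spends roughly 30% of
its time refaulting, so the 10% threshold trips after about 0.7s. The
same structure would work per-workload or system-wide; what "share of
time" to measure against (wall clock, task runtime) is exactly the open
question in the reply above.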