On Thu, Apr 16, 2020 at 6:06 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> Tejun describes the problem as follows:
>
> When swap runs out, there's an abrupt change in system behavior -
> the anonymous memory suddenly becomes unmanageable which readily
> breaks any sort of memory isolation and can bring down the whole
> system.

Can you please add more info on this abrupt change in system behavior,
and on what you mean by anon memory becoming unmanageable?

Once the system is in global reclaim and swapping, memory isolation is
already broken. Here I am assuming you are talking about memcg limit
reclaim with overcommitted memcg limits. Shouldn't running out of swap
trigger the OOM killer earlier, which should be better than impacting
the whole system?

> To avoid that, oomd [1] monitors free swap space and triggers
> kills when it drops below the specific threshold (e.g. 15%).
>
> While this works, it's far from ideal:
>  - Depending on IO performance and total swap size, a given
>    headroom might not be enough or too much.
>  - oomd has to monitor swap depletion in addition to the usual
>    pressure metrics and it currently doesn't consider memory.swap.max.
>
> Solve this by adapting the same approach that memory.high uses -
> slow down allocation as the resource gets depleted turning the
> depletion behavior from abrupt cliff one to gradual degradation
> observable through memory pressure metric.
>
> [1] https://github.com/facebookincubator/oomd
>
> Jakub Kicinski (3):
>   mm: prepare for swap over-high accounting and penalty calculation
>   mm: move penalty delay clamping out of calculate_high_delay()
>   mm: automatically penalize tasks with high swap use
>
>  include/linux/memcontrol.h |   4 +
>  mm/memcontrol.c            | 166 ++++++++++++++++++++++++++++---------
>  2 files changed, 131 insertions(+), 39 deletions(-)
>
> --
> 2.25.2
>
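
To make the "gradual degradation" idea in the quoted cover letter concrete,
here is a minimal user-space sketch, not the code from the series. It assumes
a per-allocation delay that grows quadratically with the fraction by which
swap usage exceeds a configured threshold and is clamped to an upper bound.
The names swap_overage() and swap_penalty_ms(), the millisecond units, the
fixed-point shift and the 2000 ms clamp are all hypothetical, chosen only for
illustration.

```c
#include <stdint.h>

/* Hypothetical upper bound on a single stall; not a value from the series. */
#define MAX_PENALTY_MS 2000u
/* Fixed-point shift so that small overages are not rounded down to zero. */
#define PRECISION_SHIFT 10

/*
 * Fraction by which usage exceeds the threshold, in fixed point:
 * (usage - high) / high, scaled by 2^PRECISION_SHIFT.
 * Inputs are assumed to be page counts small enough not to overflow.
 */
static uint64_t swap_overage(uint64_t usage, uint64_t high)
{
	if (usage <= high)
		return 0;
	if (high == 0)
		high = 1;	/* avoid dividing by zero */
	return ((usage - high) << PRECISION_SHIFT) / high;
}

/*
 * Delay to apply to one allocation. Quadratic in the overage, so the
 * stall is negligible just above the threshold and grows quickly as
 * swap fills up; clamped so that no single stall is unbounded.
 */
static uint32_t swap_penalty_ms(uint64_t usage, uint64_t high)
{
	uint64_t overage = swap_overage(usage, high);
	uint64_t ms = (overage * overage * 1000) >> (2 * PRECISION_SHIFT);

	return ms > MAX_PENALTY_MS ? MAX_PENALTY_MS : (uint32_t)ms;
}
```

Under these assumptions an allocator at or below the threshold is never
delayed, and the delay rises smoothly with the overage instead of flipping
from "no effect" to "kill" at a fixed free-swap percentage, which is the
contrast with the oomd threshold approach described above.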