Hello,

On Fri, Apr 17, 2020 at 09:11:33AM -0700, Shakeel Butt wrote:
> On Thu, Apr 16, 2020 at 6:06 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> >
> > Tejun describes the problem as follows:
> >
> > When swap runs out, there's an abrupt change in system behavior -
> > the anonymous memory suddenly becomes unmanageable which readily
> > breaks any sort of memory isolation and can bring down the whole
> > system.
>
> Can you please add more info on this abrupt change in system behavior
> and what do you mean by anon memory becoming unmanageable?

In the sense that anonymous memory becomes essentially memlocked.

> Once the system is in global reclaim and doing swapping the memory
> isolation is already broken. Here I am assuming you are talking about

There currently are issues with anonymous memory management which make
it different from / worse than page cache, but I don't follow why
swapping necessarily means that isolation is broken. Page refaults
don't indicate that memory isolation is broken, after all.

> memcg limit reclaim and memcg limits are overcommitted. Shouldn't
> running out of swap will trigger the OOM earlier which should be
> better than impacting the whole system.

The primary scenario being considered was undercommitted protections,
but I don't think that makes any relevant difference.

This is directly analogous to delay injection for memory.high. What's
desired is slowing down the workload as the available resource is
depleted, so that the resource shortage presents as a gradual
degradation of performance and a matching increase in resource PSI.
This allows the situation to be detected and handled from userland
while avoiding sudden and unpredictable behavior changes.

Thanks.

-- 
tejun