On Thu, 28 May 2020 at 20:33, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Fri 22-05-20 02:23:09, Naresh Kamboju wrote:
> > My apology !
> > As per the test results history this problem started happening from
> > Bad : next-20200430 (still reproducible on next-20200519)
> > Good : next-20200429
> >
> > The git tree / tag used for testing is from linux next-20200430 tag and reverted
> > following three patches and oom-killer problem fixed.
> >
> > Revert "mm, memcg: avoid stale protection values when cgroup is above
> > protection"
> > Revert "mm, memcg: decouple e{low,min} state mutations from protection checks"
> > Revert "mm-memcg-decouple-elowmin-state-mutations-from-protection-checks-fix"
>
> The discussion has fragmented and I got lost TBH.
> In http://lkml.kernel.org/r/CA+G9fYuDWGZx50UpD+WcsDeHX9vi3hpksvBAWbMgRZadb0Pkww@xxxxxxxxxxxxxx
> you have said that none of the added tracing output has triggered. Does
> this still hold? Because I still have a hard time to understand how
> those three patches could have the observed effects.

On the other email thread [1] this issue is concluded.

Yafang wrote on May 22 2020,

    Regarding the root cause, my guess is it makes a similar mistake
    that I tried to fix in the previous patch that the direct reclaimer
    read a stale protection value. But I don't think it is worth to add
    another fix. The best way is to revert this commit.

[1] [PATCH v3 2/2] mm, memcg: Decouple e{low,min} state mutations from
protection checks
https://lore.kernel.org/linux-mm/CALOAHbArZ3NsuR3mCnx_kbSF8ktpjhUF2kaaTa7Mb7ocJajsQg@xxxxxxxxxxxxxx/

- Naresh

> --
> Michal Hocko
> SUSE Labs