On Mon 18-10-21 17:25:23, Zhaoyang Huang wrote:
> On Mon, Oct 18, 2021 at 4:23 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Fri 15-10-21 14:15:29, Huangzhaoyang wrote:
> > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > >
> > > Sibling thread of the same process could refault the reclaimed pages
> > > in the same time, which would be typical in None global reclaim and
> > > introduce thrashing.
> >
> > It is hard to understand what kind of problem you see (ideally along
> > with some numbers) and how the proposed patch addresses that problem
> >
> > Also you are missing Signed-off-by tag (please have a look at
> > Documentation/process/submitting-patches.rst which is much more
> > comprehensive about the process).
> sorry for that, I will fix it.
> >
> > > ---
> > >  mm/vmscan.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index 5199b96..ebbdc37 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -2841,6 +2841,11 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> > >                               sc->memcg_low_skipped = 1;
> > >                               continue;
> > >                       }
> > > +                     /*
> > > +                      * Don't bother current when its memcg is below low
> > > +                      */
> > > +                     if (get_mem_cgroup_from_mm(current->mm) == memcg)
> > > +                             continue;
> >
> > This code is executed when none of memcg in the reclaimed hierarchy
> > could be reclaimed. Low limit is then ignored and this change is
> > tweaking that behavior without any description of the effect. A very
> > vague note about trashing would indicate that you have something like
> > the following
> >
> >               A (hiting hard limit)
> >              / \
> >             B   C
> >
> > Both B and C low limit protected and current task associated with B. As
> > none of the two could be reclaimed due to soft protection yuu prefer to
> > reclaim from C as you do not want to reclaim from the current process as
> > that could reclaim current's working set. Correct?
> >
> > I would be really curious about more specifics of the used hierarchy.
> What I am facing is a typical scenario on Android, that is a big
> memory consuming APP(camera etc) launched while background filled by
> other processes. The hierarchy is like what you describe above where B
> represents the APP and memory.low is set to help warm restart. Both of
> kswapd and direct reclaim work together to reclaim pages under this
> scenario, which can cause 20MB file page delete from LRU in several
> second. This change could help to have current process's page escape
> from being reclaimed and cause page thrashing. We observed the result
> via systrace which shows that the Uninterruptible sleep(block on page
> bit) and iowait get smaller than usual.

I still have a hard time understanding the exact setup and why the patch
helps you. If you want to protect B more than the low limit would allow
by stealing from C, then the same thing can happen when anybody in C
reclaims, so in the end there is no protection. The same applies to any
global direct memory reclaim done by a 3rd party. So I suspect that your
patch just happens to work by luck.

Why do both B and C have a low limit set, and why can neither of them be
reclaimed? Isn't that a weird setup, where the hard limit of A is too
close to the sum of the low limits of B and C?

In other words, could you share a more detailed configuration you are
using, and some more details on why both B and C have been skipped
during the first pass of the reclaim?
--
Michal Hocko
SUSE Labs
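
To make the concern above concrete, here is a toy userspace model of the
two-pass reclaim in shrink_node_memcgs() with the proposed check added.
It is only a sketch under stated assumptions, not kernel code: struct
toy_memcg, toy_below_low(), toy_pass() and toy_reclaim() are made-up
names, protection is simplified to "usage <= low", and the retry
condition is simplified to "nothing reclaimed and something was skipped".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: only usage and memory.low, in pages. */
struct toy_memcg {
	unsigned long usage;
	unsigned long low;
};

/* Simplified protection test; the kernel compares against an effective low. */
static bool toy_below_low(const struct toy_memcg *m)
{
	return m->usage <= m->low;
}

/* One walk over the children of A. Returns the number of pages reclaimed. */
static unsigned long toy_pass(struct toy_memcg *kids, size_t n,
			      unsigned long want, bool low_reclaim,
			      const struct toy_memcg *current_cg,
			      bool *low_skipped)
{
	unsigned long got = 0;

	for (size_t i = 0; i < n && got < want; i++) {
		struct toy_memcg *m = &kids[i];

		if (toy_below_low(m)) {
			if (!low_reclaim) {
				/* First pass: honour the low limit. */
				*low_skipped = true;
				continue;
			}
			/* Models the proposed check: spare current's memcg. */
			if (m == current_cg)
				continue;
		}

		unsigned long take = want - got;
		if (take > m->usage)
			take = m->usage;
		m->usage -= take;
		got += take;
	}
	return got;
}

/* First pass respects low; retry ignoring low if nothing was reclaimed. */
static unsigned long toy_reclaim(struct toy_memcg *kids, size_t n,
				 unsigned long want,
				 const struct toy_memcg *current_cg)
{
	bool low_skipped = false;
	unsigned long got;

	got = toy_pass(kids, n, want, false, current_cg, &low_skipped);
	if (got == 0 && low_skipped)
		got = toy_pass(kids, n, want, true, current_cg, &low_skipped);
	return got;
}
```

With B and C both sitting at their low limit, a reclaimer running in B
takes its pages from C, but a reclaimer running in C takes the same
amount from B. In this model B ends up no better protected than before,
which is the point made above.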