On Mon 03-08-20 22:18:52, Yafang Shao wrote:
> On Mon, Aug 3, 2020 at 9:56 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Mon 03-08-20 21:20:44, Yafang Shao wrote:
> > > On Mon, Aug 3, 2020 at 6:12 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > >
> > > > On Fri 31-07-20 09:50:04, Yafang Shao wrote:
> > > > > On Thu, Jul 30, 2020 at 7:26 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > > >
> > > > > > On Tue 28-07-20 03:40:32, Yafang Shao wrote:
> > > > > > > Sometimes we use memory.force_empty to drop pages in a memcg to
> > > > > > > work around some memory pressure issues. When we use force_empty,
> > > > > > > we want the pages to be reclaimed ASAP; however, force_empty
> > > > > > > reclaims pages as a regular reclaimer, which scans the page cache
> > > > > > > LRUs starting from DEF_PRIORITY and only finally drops to 0 to do
> > > > > > > a full scan. That is a waste of time; we'd better do a full scan
> > > > > > > from the start in force_empty.
> > > > > >
> > > > > > Do you have any numbers please?
> > > > > >
> > > > >
> > > > > Unfortunately the number doesn't improve obviously, while it is
> > > > > directly proportional to the number of total pages to be scanned.
> > > >
> > > > Your changelog claims an optimization and that should be backed by some
> > > > numbers. It is true that reclaim at a higher priority behaves slightly
> > > > and subtly differently, but that calls for even more details in the
> > > > changelog.
> > > >
> > >
> > > With the additional change below (nr_to_scan also changed), the elapsed
> > > time of force_empty can be reduced by 10%.
> > >
> > > @@ -3208,6 +3211,7 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
> > >  static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
> > >  {
> > >         int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
> > > +       unsigned long size;
> > >
> > >         /* we call try-to-free pages for make this cgroup empty */
> > >         lru_add_drain_all();
> > > @@ -3215,14 +3219,15 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
> > >         drain_all_stock(memcg);
> > >         /* try to free all pages in this cgroup */
> > > -       while (nr_retries && page_counter_read(&memcg->memory)) {
> > > +       while (nr_retries && (size = page_counter_read(&memcg->memory))) {
> > >                 int progress;
> > >
> > >                 if (signal_pending(current))
> > >                         return -EINTR;
> > > -               progress = try_to_free_mem_cgroup_pages(memcg, 1,
> > > -                                                       GFP_KERNEL, true);
> > > +               progress = try_to_free_mem_cgroup_pages(memcg, size,
> > > +                                                       GFP_KERNEL, true,
> > > +                                                       0);
> >
> > Have you tried this change without changing the reclaim priority?
> >
>
> I tried it again. It seems the improvement is mostly due to the change of
> nr_to_reclaim, rather than the reclaim priority:
>
> -               progress = try_to_free_mem_cgroup_pages(memcg, 1,
> +               progress = try_to_free_mem_cgroup_pages(memcg, size,

This is what I expected. The reclaim priority might have some side
effects as well, but that requires very specific conditions in which the
reclaim really has to dive to large scan windows to make progress.

It would be interesting to find out where the improvement comes from and
how stable those numbers are, because normally it shouldn't matter much
whether you make N rounds over the reclaim with a smaller target or do
the reclaim in a single round.

--
Michal Hocko
SUSE Labs