On Tue, Nov 26, 2019 at 3:31 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Tue 26-11-19 11:52:19, Yafang Shao wrote:
> > On Mon, Nov 25, 2019 at 10:42 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > >
> > > On Mon, Nov 25, 2019 at 03:21:50PM +0100, Michal Hocko wrote:
> > > > On Mon 25-11-19 22:11:15, Yafang Shao wrote:
> > > > > When there're no processes, we don't need to protect the pages. You
> > > > > can consider it as 'fault tolerance' .
> > > >
> > > > I have already tried to explain why this is a bold statement that
> > > > doesn't really hold universally and that the kernel doesn't really have
> > > > enough information to make an educated guess.
> > >
> > > I agree, this is not obviously true. And the kernel shouldn't try to
> > > guess whether the explicit userspace configuration is still desirable
> > > to userspace or not. Should we also delete the cgroup when it becomes
> > > empty for example?
> > >
> > > It's better to implement these kinds of policy decisions from
> > > userspace.
> > >
> > > There is a cgroup.events file that can be polled, and its "populated"
> > > field shows conveniently whether there are tasks in a subtree or
> > > not. You can use that to clear protection settings.
> >
> > Why isn't force_empty supported in cgroup2 ?
>
> There wasn't any sound usecase AFAIR.
>
> > In this case we can free the protected file pages immdiately with force_empty.
>
> You can do the same thing by setting the hard limit to 0.

I looked through the code, and the difference between setting the hard
limit to 0 and force_empty is that setting the hard limit to 0 will
generate some OOM reports, which should not happen in this case.
I think we should make a small improvement, as below:

@@ -6137,9 +6137,11 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
 			continue;
 		}
-		memcg_memory_event(memcg, MEMCG_OOM);
-		if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
-			break;
+		if (cgroup_is_populated(memcg->css.cgroup)) {
+			memcg_memory_event(memcg, MEMCG_OOM);
+			if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
+				break;
+		}
 	}

Well, if someone doesn't want to kill processes but only wants to drop
page caches, setting the hard limit to 0 won't work.

Thanks
Yafang