Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens

Yafang Shao <laoar.shao@xxxxxxxxx> · Tue, 26 Nov 2019 17:35:59 +0800




On Tue, Nov 26, 2019 at 3:31 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Tue 26-11-19 11:52:19, Yafang Shao wrote:
> > On Mon, Nov 25, 2019 at 10:42 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > >
> > > On Mon, Nov 25, 2019 at 03:21:50PM +0100, Michal Hocko wrote:
> > > > On Mon 25-11-19 22:11:15, Yafang Shao wrote:
> > > > > When there're no processes, we don't need to protect the pages. You
> > > > > can consider it as 'fault tolerance' .
> > > >
> > > > I have already tried to explain why this is a bold statement that
> > > > doesn't really hold universally and that the kernel doesn't really have
> > > > enough information to make an educated guess.
> > >
> > > I agree, this is not obviously true. And the kernel shouldn't try to
> > > guess whether the explicit userspace configuration is still desirable
> > > to userspace or not. Should we also delete the cgroup when it becomes
> > > empty for example?
> > >
> > > It's better to implement these kinds of policy decisions from
> > > userspace.
> > >
> > > There is a cgroup.events file that can be polled, and its "populated"
> > > field shows conveniently whether there are tasks in a subtree or
> > > not. You can use that to clear protection settings.
> >
> > Why isn't force_empty supported in cgroup2 ?
>
> There wasn't any sound usecase AFAIR.
>
> > In this case we can free the protected file pages immediately with force_empty.
>
> You can do the same thing by setting the hard limit to 0.

I looked through the code, and the difference between setting the hard
limit to 0 and force_empty is that setting the hard limit to 0 will
generate some OOM reports, which should not happen in this case.
I think we should make a small improvement, as below:

@@ -6137,9 +6137,11 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
                        continue;
                }

-               memcg_memory_event(memcg, MEMCG_OOM);
-               if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
-                       break;
+               if (cgroup_is_populated(memcg->css.cgroup)) {
+                       memcg_memory_event(memcg, MEMCG_OOM);
+                       if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
+                               break;
+               }
        }

Well, if someone doesn't want to kill processes but only wants to drop
page caches, setting the hard limit to 0 won't work.
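
For reference, the userspace approach suggested earlier in the thread
(check the "populated" field of cgroup.events and clear the protection
settings once the cgroup is empty) could look roughly like the sketch
below. It is only an illustration under some assumptions: the helper
names (parse_populated, write_str, clear_protection_if_empty) are
hypothetical, it assumes protection was configured through memory.min
and memory.low, and it does a single check rather than waiting on
cgroup.events for change notifications as a real agent would.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse the text of a cgroup.events file.
 * Returns 1 for "populated 1", 0 for "populated 0",
 * and -1 if no "populated" line is found.
 */
static int parse_populated(const char *events)
{
	const char *line = events;

	while (line && *line) {
		int val;

		if (sscanf(line, "populated %d", &val) == 1)
			return val ? 1 : 0;
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}

/* Write a short string to a cgroup control file. Returns 0 on success. */
static int write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(val, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * If the cgroup at cgroup_dir has no tasks in its subtree, reset
 * memory.min and memory.low to 0 (assumed to be where protection was set).
 * Returns 1 if protection was cleared, 0 if nothing was done, -1 on error.
 */
static int clear_protection_if_empty(const char *cgroup_dir)
{
	char path[4096], buf[256];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "%s/cgroup.events", cgroup_dir);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	/* Still populated, or state unknown: leave the settings alone. */
	if (parse_populated(buf) != 0)
		return 0;

	snprintf(path, sizeof(path), "%s/memory.min", cgroup_dir);
	if (write_str(path, "0") < 0)
		return -1;
	snprintf(path, sizeof(path), "%s/memory.low", cgroup_dir);
	if (write_str(path, "0") < 0)
		return -1;
	return 1;
}
```

An agent would call clear_protection_if_empty() with the cgroup's
directory (for example a path under /sys/fs/cgroup) each time it is
notified of a change. Note that this only removes the protection; it
does not by itself reclaim the page cache immediately.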

Thanks
Yafang

