Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens

On Tue, Nov 26, 2019 at 6:22 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Tue 26-11-19 18:02:27, Yafang Shao wrote:
> > On Tue, Nov 26, 2019 at 5:50 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Tue 26-11-19 17:35:59, Yafang Shao wrote:
> > > > On Tue, Nov 26, 2019 at 3:31 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue 26-11-19 11:52:19, Yafang Shao wrote:
> > > > > > On Mon, Nov 25, 2019 at 10:42 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Mon, Nov 25, 2019 at 03:21:50PM +0100, Michal Hocko wrote:
> > > > > > > > On Mon 25-11-19 22:11:15, Yafang Shao wrote:
> > > > > > > > > When there are no processes, we don't need to protect the pages. You
> > > > > > > > > can consider it as 'fault tolerance'.
> > > > > > > >
> > > > > > > > I have already tried to explain why this is a bold statement that
> > > > > > > > doesn't really hold universally and that the kernel doesn't really have
> > > > > > > > enough information to make an educated guess.
> > > > > > >
> > > > > > > I agree, this is not obviously true. And the kernel shouldn't try to
> > > > > > > guess whether the explicit userspace configuration is still desirable
> > > > > > > to userspace or not. Should we also delete the cgroup when it becomes
> > > > > > > empty for example?
> > > > > > >
> > > > > > > It's better to implement these kinds of policy decisions from
> > > > > > > userspace.
> > > > > > >
> > > > > > > There is a cgroup.events file that can be polled, and its "populated"
> > > > > > > field shows conveniently whether there are tasks in a subtree or
> > > > > > > not. You can use that to clear protection settings.
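
As an aside, a minimal userspace sketch of that polling approach might look
like the code below. It is only an illustration, not part of the original
suggestion: the cgroup path "/sys/fs/cgroup/foo" is hypothetical, and the
choice to clear memory.min and memory.low once "populated 0" appears is just
one possible policy.

	#include <fcntl.h>
	#include <poll.h>
	#include <string.h>
	#include <unistd.h>

	/* Hypothetical cgroup v2 group path, purely for illustration. */
	#define CG "/sys/fs/cgroup/foo"

	static void write_str(const char *path, const char *val)
	{
		int fd = open(path, O_WRONLY);

		if (fd >= 0) {
			write(fd, val, strlen(val));
			close(fd);
		}
	}

	int main(void)
	{
		char buf[256];
		ssize_t len;
		struct pollfd pfd;

		pfd.fd = open(CG "/cgroup.events", O_RDONLY);
		if (pfd.fd < 0)
			return 1;
		pfd.events = POLLPRI;

		for (;;) {
			/* cgroup v2 signals changes to cgroup.events via poll(). */
			poll(&pfd, 1, -1);

			/* Re-read the whole file after each notification. */
			lseek(pfd.fd, 0, SEEK_SET);
			len = read(pfd.fd, buf, sizeof(buf) - 1);
			if (len <= 0)
				continue;
			buf[len] = '\0';

			/* The group became empty: drop its memory protection. */
			if (strstr(buf, "populated 0")) {
				write_str(CG "/memory.min", "0");
				write_str(CG "/memory.low", "0");
				break;
			}
		}
		return 0;
	}
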
> > > > > >
> > > > > > Why isn't force_empty supported in cgroup2?
> > > > >
> > > > > There wasn't any sound usecase AFAIR.
> > > > >
> > > > > > In this case we can free the protected file pages immediately with force_empty.
> > > > >
> > > > > You can do the same thing by setting the hard limit to 0.
> > > >
> > > > I looked through the code, and the difference between setting the hard
> > > > limit to 0 and force_empty is that setting the hard limit to 0 will
> > > > generate some OOM reports, which should not happen in this case.
> > > > I think we should make a small improvement as below:
> > >
> > > Yes, if you are not able to reclaim all of the memory then the OOM
> > > killer is triggered. And that was not the case with force_empty. I
> > > didn't mean that the two are equivalent, sorry if I misled you.
> > > I merely wanted to point out that you have means to clean up the memcg
> > > with the existing API.
> > >
> > > > @@ -6137,9 +6137,11 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> > > >                         continue;
> > > >                 }
> > > >
> > > > -               memcg_memory_event(memcg, MEMCG_OOM);
> > > > -               if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > > > -                       break;
> > > > +               if (cgroup_is_populated(memcg->css.cgroup)) {
> > > > +                       memcg_memory_event(memcg, MEMCG_OOM);
> > > > +                       if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > > > +                               break;
> > > > +               }
> > > >         }
> > >
> > > If there are no killable tasks then
> > >         "Out of memory and no killable processes..."
> > > is printed, and that really reflects the situation and is the right thing
> > > to do. Your above patch would suppress that information, which might be
> > > important.
> > >
> >
> > Not only this output.
> > Please see dump_header(): there are many outputs, and even worse,
> > dump_stack() is also executed.
>
> Yes, there will be the full oom report. I have outlined the "no
> killable" part because this is the main distinguisher for the "no tasks"
> case.
>

But the case here is "no tasks"
(!cgroup_is_populated(memcg->css.cgroup)), rather than "no killable".
This output is really misleading.

> > > > Well, if someone doesn't want to kill processes but only wants to drop
> > > > page caches, setting the hard limit to 0 won't work.
> > >
> > > Could you be more specific about a real world example when somebody
> > > wants to drop per-memcg pagecache?
> >
> > For example, if one memcg has lots of negative dentries, that causes
> > the file page cache to be continuously reclaimed, so we want to drop all
> > these negative dentries. force_empty is a better workaround so far, and
> > it gives us more of a chance to analyze why the negative dentries are
> > generated.
>
> force_empty sounds like a brute-force way to clean negative dentries TBH.
> And it is not really all that different from shrinking the hard
> limit.
>
> Why doesn't normal reclaim work for that situation? Anyway, this is
> getting really tangential to the original topic, so I would suggest starting
> a new email thread with a clear description of the problem you are facing,
> and we can go from there.

Sure.
I will start a new thread.



