Re: [LSF/MM TOPIC ATTEND]

Michal Hocko <mhocko@xxxxxxx> · Wed, 7 Jan 2015 15:38:58 +0100

On Wed 07-01-15 11:58:28, Vladimir Davydov wrote:
> On Tue, Jan 06, 2015 at 05:14:35PM +0100, Michal Hocko wrote:
> [...]
> > And as a memcg co-maintainer I would like to also discuss the following
> > topics.
> > - We should finally settle down with a set of core knobs exported with
> >   the new unified hierarchy cgroups API. I have proposed this already
> >   http://marc.info/?l=linux-mm&m=140552160325228&w=2 but there is no
> >   clear consensus and the discussion has died later on. I feel it would
> >   be more productive to sit together and come up with a reasonable
> >   compromise between - let's start from the begining and keep useful and
> >   reasonable features.
> >   
> > - kmem accounting is seeing a lot of activity mainly thanks to Vladimir.
> >   He is basically the only active developer in this area. I would be
> >   happy if he can attend as well and discuss his future plans in the
> >   area. The work overlaps with slab allocators and slab shrinkers so
> >   having people familiar with these areas would be more than welcome
> 
> One more memcg related topic that is worth discussing IMO:
> 
>  - On global memory pressure we walk over all memory cgroups and scan
>    pages from each of them. Since there can be hundreds or even
>    thousands of memory cgroups, such a walk can be quite expensive,
>    especially if the cgroups are small so that to reclaim anything from
>    them we have to descend to a lower scan priority.

     We do not get to lower priorities just to scan small cgroups. They
     will simply get ignored unless we are force scanning them.

>    The problem is
>    augmented by offline memory cgroups, which now can be dangling for
>    indefinitely long time.

OK, but shrink_lruvec shouldn't do too much work on a memcg which
doesn't have any pages to scan for the given priority. Or have you seen
this in some profiles?

>    That's why I think we should work out a better algorithm for the
>    memory reclaimer. May be, we could rank memory cgroups somehow (by
>    their age, memory consumption?) and try to scan only the top ranked
>    cgroup during a reclaimer run.

We still have to keep some fairness and reclaim all groups
proportionally and balancing this would be quite non-trivial. I am not
saying we couldn't implement our iterators in a more intelligent way but
this code is quite complex already and I haven't seen this as a big
problem yet. Some overhead is to be expected when thousands of groups
are configured, right?

>    This topic is also very close to the
>    soft limit reclaim improvements, which Michal has been working on for
>    a while.

The patches I have for the low limit reclaim didn't care about an
intelligent filtering of non-reclaimable groups because I thought it
would be too early to complicate the code at this stage. Especially when
non-reclaimable will be a very small minority in the real life. This
wasn't the case with the old soft limit because we had opposite
situation there.

Nevertheless I am definitely open to discussing improvements.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>