Re: [PATCH v4 -mm] rework mem_cgroup iterator

Hi Andrew,
it seems that there were no other objections[1] to this version of the
patchset. Could you take it into your tree and target it for 3.10?

---
[1] The only objection, from Johannes (https://lkml.org/lkml/2013/2/8/379),
has been addressed.

On Thu 14-02-13 14:26:30, Michal Hocko wrote:
> Hi,
> this is the fourth iteration of the patch series previously posted here:
> http://lkml.org/lkml/2013/1/3/293. I am adding Andrew to the CC as I
> feel this is finally getting to an acceptable state. It still needs
> review, as some things have changed a lot since the last version, but
> we are getting there...
> 
> The patch set tries to make mem_cgroup_iter saner in the way it walks
> hierarchies. The css->id based traversal is far from ideal: it is not
> deterministic because it depends on the creation ordering. On top of
> that, css_id is considered a burden by the cgroup maintainers because
> it amounts to quite some code and memcg is its last user. After this
> series only the swap accounting still uses css_id, and that will be
> dealt with in a follow-up.
> 
> The diffstat (if we exclude removed/added comments) looks quite
> promising. We got rid of some code:
> $ git diff mmotm... | grep -v "^[+-][[:space:]]*[/ ]\*" | diffstat
>  b/include/linux/cgroup.h |    3 ---
>  kernel/cgroup.c          |   33 ---------------------------------
>  mm/memcontrol.c          |    4 +++-
>  3 files changed, 3 insertions(+), 37 deletions(-)
> 
> The first patch is just preparatory and it changes when we release the
> css of the previously returned memcg. Nothing controversial.
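> 
> A minimal userspace sketch of the new ordering (the names here are
> made up and illustrative, not the kernel API; the real code uses
> css_get/css_put): the reference on the previously returned group is
> dropped only after the next group has been found and pinned, so prev
> stays alive for the whole iteration step:
> 
> 	struct group { int refcnt; };
> 
> 	static void group_get(struct group *g) { g->refcnt++; }
> 	static void group_put(struct group *g) { g->refcnt--; }
> 
> 	/* find_next() stands in for the actual hierarchy walk */
> 	extern struct group *find_next(struct group *prev);
> 
> 	static struct group *iter_step(struct group *prev)
> 	{
> 		struct group *next = find_next(prev); /* prev still alive */
> 
> 		if (next)
> 			group_get(next);
> 		if (prev)
> 			group_put(prev);	/* dropped only now */
> 		return next;
> 	}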
> 
> The second patch is the core of the patchset and it replaces the
> css_id based css_get_next with the generic cgroup pre-order walk. This
> brings some challenges for the last visited group caching during the
> reclaim (mem_cgroup_per_zone::reclaim_iter). We have to use memcg
> pointers directly now, which means that we have to keep a reference to
> those groups' css to keep them alive.
> I also folded the iter_lock introduced by https://lkml.org/lkml/2013/1/3/295
> in the previous version into this patch. Johannes felt the race I
> was describing should be mostly harmless and I haven't been able to
> trigger it, so the lock doesn't deserve its own patch. It is still
> needed temporarily, though, because the reference counting on
> iter->last_visited depends on it. It will go away with the next patch.
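> 
> For illustration, this is roughly what a pre-order walk looks like (a
> self-contained userspace sketch; in the kernel the walk is provided by
> the generic cgroup iterators, so none of these names are the real API):
> 
> 	#include <stdio.h>
> 
> 	struct node {
> 		struct node *parent, *first_child, *next_sibling;
> 		const char *name;
> 	};
> 
> 	/* visit a node, then its children, then its siblings -
> 	 * deterministic and independent of any id allocation order */
> 	static struct node *next_descendant_pre(struct node *pos,
> 						struct node *root)
> 	{
> 		if (!pos)
> 			return root;
> 		if (pos->first_child)
> 			return pos->first_child;
> 		while (pos != root) {
> 			if (pos->next_sibling)
> 				return pos->next_sibling;
> 			pos = pos->parent;
> 		}
> 		return NULL;
> 	}
> 
> 	int main(void)
> 	{
> 		struct node c3 = { .name = "3" };
> 		struct node c2 = { .next_sibling = &c3, .name = "2" };
> 		struct node c1 = { .next_sibling = &c2, .name = "1" };
> 		struct node root = { .first_child = &c1, .name = "A" };
> 		struct node *pos = NULL;
> 
> 		c1.parent = c2.parent = c3.parent = &root;
> 		while ((pos = next_descendant_pre(pos, &root)))
> 			printf("%s\n", pos->name);	/* A 1 2 3 */
> 		return 0;
> 	}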
> 
> The next patch fixes up an unbounded cgroup removal holdoff caused
> by the elevated css refcount. The issue has been observed by
> Ying Han. Johannes wasn't impressed by the previous version of the fix
> (https://lkml.org/lkml/2013/2/8/379), which cleaned up pending references
> during mem_cgroup_css_offline when a group is removed. He has suggested
> a different approach in which the iterator checks whether a cached
> memcg is still valid or not. More on that in the patch, but the basic
> idea is that every memcg tracks the number of removed subgroups and the
> iterator records this number when a group is cached. These numbers are
> compared before iter->last_visited is about to be used and the
> iteration is restarted if the cache is invalid.
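> 
> Roughly, the invalidation scheme looks like this (a simplified sketch
> with made-up names; the real patch uses an atomic dead_count per memcg
> together with css_tryget, and all locking is omitted here):
> 
> 	struct group {
> 		struct group *parent;
> 		unsigned long dead_count;	/* removed descendants */
> 	};
> 
> 	struct reclaim_iter {
> 		struct group *last_visited;
> 		unsigned long last_dead_count;	/* root's count at caching */
> 	};
> 
> 	/* a subgroup went away: bump the count of all its ancestors */
> 	static void group_removed(struct group *g)
> 	{
> 		struct group *p;
> 
> 		for (p = g->parent; p; p = p->parent)
> 			p->dead_count++;
> 	}
> 
> 	static void iter_cache(struct reclaim_iter *iter,
> 			       struct group *root, struct group *pos)
> 	{
> 		iter->last_dead_count = root->dead_count;
> 		iter->last_visited = pos;
> 	}
> 
> 	/* returns the cached position or NULL to restart from the root */
> 	static struct group *iter_load_cached(struct reclaim_iter *iter,
> 					      struct group *root)
> 	{
> 		if (iter->last_dead_count != root->dead_count)
> 			return NULL;	/* a removal under root, stale cache */
> 		return iter->last_visited;
> 	}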
> 
> The fourth and fifth patches are an attempt to simplify
> mem_cgroup_iter. The css juggling is removed and the iteration logic is
> moved to a helper so that the reference counting and the iteration are
> separated.
> 
> The last patch just removes css_get_next as there is no user for it any
> longer.
> 
> I have dropped Acks from patches that needed a rework since the last
> time so please have a look at them again (especially 1-4). I hope I
> haven't screwed anything up during the rebase.
> 
> My testing looked as follows:
>         A (use_hierarchy=1, limit_in_bytes=150M)
>        /|\
>       1 2 3
> 
> Child groups were created so that their number was never higher than
> 3 and their limits were chosen randomly between 50-100M. Each group
> hosts a kernel build (starting with tar -xf so the tree is not shared,
> followed by make -jNUM_CPUs/3), is terminated after a random time (up
> to 5 minutes) and then it is removed.
> This should exercise both the leaf and the hierarchical reclaim as well
> as races with cgroup removals, and the debugging messages I added on
> top proved that. 100 groups were created during the test.
> 
> Shortlog says:
> Michal Hocko (6):
>       memcg: keep prev's css alive for the whole mem_cgroup_iter
>       memcg: rework mem_cgroup_iter to use cgroup iterators
>       memcg: relax memcg iter caching
>       memcg: simplify mem_cgroup_iter
>       memcg: further simplify mem_cgroup_iter
>       cgroup: remove css_get_next
> 
> Full diffstat says:
>  include/linux/cgroup.h |    7 ---
>  kernel/cgroup.c        |   49 ----------------
>  mm/memcontrol.c        |  145 ++++++++++++++++++++++++++++++++++++++++--------
>  3 files changed, 121 insertions(+), 80 deletions(-)
> 
> Any suggestions are welcome of course.
> 
> Thanks!
> 

-- 
Michal Hocko
SUSE Labs
