The patch titled
     Subject: memcg: move charges to root cgroup if use_hierarchy=0.
has been added to the -mm tree.  Its filename is
     memcg-move-charges-to-root-cgroup-if-use_hierarchy=0.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
Subject: memcg: move charges to root cgroup if use_hierarchy=0.

Presently, at removal of a cgroup, ->pre_destroy() is called and moves
charges to the parent cgroup.  A major reason for returning -EBUSY from
->pre_destroy() is that the 'moving' hits the parent's resource
limitation.  It happens only when use_hierarchy=0.

Considering use_hierarchy=0, all cgroups should be flat.  So, no one can
justify moving charges to the parent: parent and children are in a flat
configuration, not a hierarchical one.

This patch modifies the code to move charges to the root cgroup at
rmdir/force_empty if use_hierarchy==0.  This will greatly simplify rmdir()
and reduce errors in ->pre_destroy.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
Cc: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Ying Han <yinghan@xxxxxxxxxx>
Cc: Glauber Costa <glommer@xxxxxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 Documentation/cgroups/memory.txt |   13 ++++---
 mm/memcontrol.c                  |   49 ++++++++++-------------------
 2 files changed, 25 insertions(+), 37 deletions(-)

diff -puN Documentation/cgroups/memory.txt~memcg-move-charges-to-root-cgroup-if-use_hierarchy=0 Documentation/cgroups/memory.txt
--- a/Documentation/cgroups/memory.txt~memcg-move-charges-to-root-cgroup-if-use_hierarchy=0
+++ a/Documentation/cgroups/memory.txt
@@ -393,14 +393,15 @@ cgroup might have some charge associated
 tasks have migrated away from it. (because we charge against pages, not
 against tasks.)
 
-Such charges are freed or moved to their parent. At moving, both of RSS
-and CACHES are moved to parent.
-rmdir() may return -EBUSY if freeing/moving fails. See 5.1 also.
+We move the stats to root (if use_hierarchy==0) or parent (if
+use_hierarchy==1), and no change on the charge except uncharging
+from the child.
 
 Charges recorded in swap information is not updated at removal of cgroup.
 Recorded information is discarded and a cgroup which uses swap (swapcache)
 will be charged as a new owner of it.
 
+About use_hierarchy, see Section 6.
 
 5. Misc. interfaces.
 
@@ -413,13 +414,15 @@ will be charged as a new owner of it.
 
   Almost all pages tracked by this memory cgroup will be unmapped and freed.
   Some pages cannot be freed because they are locked or in-use. Such pages are
-  moved to parent and this cgroup will be empty. This may return -EBUSY if
-  VM is too busy to free/move all pages immediately.
+  moved to parent(if use_hierarchy==1) or root (if use_hierarchy==0) and this
+  cgroup will be empty.
 
   Typical use case of this interface is that calling this before rmdir().
   Because rmdir() moves all pages to parent, some out-of-use page caches can be
   moved to the parent. If you want to avoid that, force_empty will be useful.
 
+  About use_hierarchy, see Section 6.
+
 5.2 stat file
 
 memory.stat file includes following statistics
diff -puN mm/memcontrol.c~memcg-move-charges-to-root-cgroup-if-use_hierarchy=0 mm/memcontrol.c
--- a/mm/memcontrol.c~memcg-move-charges-to-root-cgroup-if-use_hierarchy=0
+++ a/mm/memcontrol.c
@@ -2709,15 +2709,13 @@ static int mem_cgroup_move_parent(struct
 				  struct mem_cgroup *child,
 				  gfp_t gfp_mask)
 {
-	struct cgroup *cg = child->css.cgroup;
-	struct cgroup *pcg = cg->parent;
 	struct mem_cgroup *parent;
 	unsigned int nr_pages;
 	unsigned long uninitialized_var(flags);
 	int ret;
 
 	/* Is ROOT ? */
-	if (!pcg)
+	if (mem_cgroup_is_root(child))
 		return -EINVAL;
 
 	ret = -EBUSY;
@@ -2728,33 +2726,23 @@ static int mem_cgroup_move_parent(struct
 
 	nr_pages = hpage_nr_pages(page);
 
-	parent = mem_cgroup_from_cont(pcg);
-	if (!parent->use_hierarchy) {
-		ret = __mem_cgroup_try_charge(NULL,
-					gfp_mask, nr_pages, &parent, false);
-		if (ret)
-			goto put_back;
-	}
+	parent = parent_mem_cgroup(child);
+	/*
+	 * If no parent, move charges to root cgroup.
+	 */
+	if (!parent)
+		parent = root_mem_cgroup;
 
 	if (nr_pages > 1)
 		flags = compound_lock_irqsave(page);
 
-	if (parent->use_hierarchy) {
-		ret = mem_cgroup_move_account(page, nr_pages,
-					pc, child, parent, false);
-		if (!ret)
-			__mem_cgroup_cancel_local_charge(child, nr_pages);
-	} else {
-		ret = mem_cgroup_move_account(page, nr_pages,
-					pc, child, parent, true);
-
-		if (ret)
-			__mem_cgroup_cancel_charge(parent, nr_pages);
-	}
+	ret = mem_cgroup_move_account(page, nr_pages,
+				pc, child, parent, false);
+	if (!ret)
+		__mem_cgroup_cancel_local_charge(child, nr_pages);
 
 	if (nr_pages > 1)
 		compound_unlock_irqrestore(page, flags);
-put_back:
 	putback_lru_page(page);
 put:
 	put_page(page);
@@ -3351,9 +3339,8 @@ int mem_cgroup_move_hugetlb_parent(int i
 	struct page_cgroup *pc;
 	int csize, ret = 0;
 	struct res_counter *fail_res;
-	struct cgroup *pcgrp = cgroup->parent;
-	struct mem_cgroup *parent = mem_cgroup_from_cont(pcgrp);
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup);
+	struct mem_cgroup *parent = parent_mem_cgroup(memcg);
 	struct res_counter *counter;
 
 	if (!get_page_unless_zero(page))
@@ -3366,13 +3353,11 @@ int mem_cgroup_move_hugetlb_parent(int i
 
 	csize = PAGE_SIZE << compound_order(page);
 	/* If parent->use_hierarchy == 0, we need to charge parent */
-	if (!parent->use_hierarchy) {
-		ret = res_counter_charge(&parent->hugepage[idx],
-					 csize, &fail_res);
-		if (ret) {
-			ret = -EBUSY;
-			goto err_out;
-		}
+	if (!parent) {
+		parent = root_mem_cgroup;
+		/* root has no limit */
+		res_counter_charge_nofail(&parent->hugepage[idx],
+					  csize, &fail_res);
 	}
 	counter = &memcg->hugepage[idx];
 	res_counter_uncharge_until(counter, counter->parent, csize);
_
Subject: Subject: memcg: move charges to root cgroup if use_hierarchy=0.

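For readers who want to see the destination rule from the changelog in isolation, here is a minimal userspace sketch. It is not kernel code: `struct toy_memcg` and the `toy_*` helpers are hypothetical names made up for illustration, and the single `stat_pages` counter is an assumed stand-in for the real per-cgroup statistics. It only models the choice "accounting parent if the parent is hierarchical, otherwise root", and the fact that the move involves no limit check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	const char *name;
	struct toy_memcg *parent;	/* position in the cgroup tree, NULL for root */
	int use_hierarchy;		/* models memory.use_hierarchy */
	long stat_pages;		/* pages accounted locally to this group */
};

/*
 * Accounting parent: assumed to exist only when the tree parent is
 * hierarchical, otherwise NULL (flat configuration).
 */
static struct toy_memcg *toy_parent_memcg(const struct toy_memcg *cg)
{
	if (cg->parent && cg->parent->use_hierarchy)
		return cg->parent;
	return NULL;
}

/* Destination of a move: the accounting parent, or root if there is none. */
static struct toy_memcg *toy_move_target(const struct toy_memcg *child,
					 struct toy_memcg *root)
{
	struct toy_memcg *target = toy_parent_memcg(child);

	return target ? target : root;
}

/*
 * Move everything out of child.  No limit is consulted, so the only
 * failure in this model is trying to empty the root itself.
 */
static int toy_move_all(struct toy_memcg *child, struct toy_memcg *root)
{
	struct toy_memcg *target;

	assert(child && root);
	if (child == root)
		return -1;	/* the patch returns -EINVAL for root */

	target = toy_move_target(child, root);
	target->stat_pages += child->stat_pages;
	child->stat_pages = 0;
	return 0;
}
```

With a flat tree root/A/B (use_hierarchy=0 everywhere), `toy_move_target(&b, &root)` yields the root rather than A, which is the behaviour the patch introduces. Setting A's `use_hierarchy` to 1 makes A the target again.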
Patches currently in -mm which might be from kamezawa.hiroyu@xxxxxxxxxxxxxx are

origin.patch
linux-next.patch
hugetlb-rename-max_hstate-to-hugetlb_max_hstate.patch
hugetlbfs-dont-use-err_ptr-with-vm_fault-values.patch
hugetlbfs-add-an-inline-helper-for-finding-hstate-index.patch
hugetlb-use-mmu_gather-instead-of-a-temporary-linked-list-for-accumulating-pages.patch
hugetlb-use-mmu_gather-instead-of-a-temporary-linked-list-for-accumulating-pages-fix.patch
hugetlb-use-mmu_gather-instead-of-a-temporary-linked-list-for-accumulating-pages-fix-fix.patch
hugetlb-avoid-taking-i_mmap_mutex-in-unmap_single_vma-for-hugetlb.patch
hugetlb-simplify-migrate_huge_page.patch
memcg-add-hugetlb-extension.patch
memcg-add-hugetlb-extension-fix.patch
memcg-add-hugetlb-extension-fix-fix.patch
hugetlb-add-charge-uncharge-calls-for-hugetlb-alloc-free.patch
memcg-track-resource-index-in-cftype-private.patch
hugetlbfs-add-memcg-control-files-for-hugetlbfs.patch
hugetlbfs-add-memcg-control-files-for-hugetlbfs-use-scnprintf-instead-of-sprintf.patch
hugetlbfs-add-memcg-control-files-for-hugetlbfs-use-scnprintf-instead-of-sprintf-fix.patch
hugetlbfs-add-a-list-for-tracking-in-use-hugetlb-pages.patch
memcg-move-hugetlb-resource-count-to-parent-cgroup-on-memcg-removal.patch
memcg-move-hugetlb-resource-count-to-parent-cgroup-on-memcg-removal-fix.patch
memcg-move-hugetlb-resource-count-to-parent-cgroup-on-memcg-removal-fix-fix.patch
hugetlb-migrate-memcg-info-from-oldpage-to-new-page-during-migration.patch
memcg-add-memory-controller-documentation-for-hugetlb-management.patch
mm-mmapc-find_vma-remove-unnecessary-ifmm-check.patch
mm-mmapc-find_vma-remove-unnecessary-ifmm-check-fix.patch
mm-correctly-synchronize-rss-counters-at-exit-exec.patch
thp-memcg-split-hugepage-for-memcg-oom-on-cow.patch
mm-do_migrate_pages-calls-migrate_to_node-even-if-task-is-already-on-a-correct-node.patch
mm-do_migrate_pages-calls-migrate_to_node-even-if-task-is-already-on-a-correct-node-fix.patch
mm-do_migrate_pages-rename-arguments.patch
kernel-cgroup-push-rcu-read-locking-from-css_is_ancestor-to-callsite.patch
mm-memcg-count-pte-references-from-every-member-of-the-reclaimed-hierarchy.patch
mm-thp-drop-page_table_lock-to-uncharge-memcg-pages.patch
documentation-memcg-future-proof-hierarchical-statistics-documentation.patch
mm-page_allocc-remove-pageblock_default_order.patch
memcg-fix-change-behavior-of-shared-anon-at-moving-task.patch
memcg-swap-mem_cgroup_move_swap_account-never-needs-fixup.patch
memcg-swap-use-mem_cgroup_uncharge_swap.patch
mm-memcg-scanning_global_lru-means-mem_cgroup_disabled.patch
mm-memcg-move-reclaim_stat-into-lruvec.patch
mm-push-lru-index-into-shrink_active_list.patch
mm-push-lru-index-into-shrink_active_list-fix.patch
mm-mark-mm-inline-functions-as-__always_inline.patch
mm-remove-lru-type-checks-from-__isolate_lru_page.patch
mm-memcg-kill-mem_cgroup_lru_del.patch
memcg-mark-more-functions-variables-as-static.patch
memcg-remove-unused-variable.patch
memcg-mark-stat-field-of-mem_cgroup-struct-as-__percpu.patch
memcg-remove-redundant-parentheses.patch
memcg-make-threshold-index-in-the-right-position.patch
memcg-revise-the-position-of-threshold-index-while-unregistering-event.patch
memcg-add-mlock-statistic-in-memorystat.patch
memcg-add-mlock-statistic-in-memorystat-fix.patch
mm-vmscan-store-priority-in-struct-scan_control.patch
mm-add-link-from-struct-lruvec-to-struct-zone.patch
mm-vmscan-push-lruvec-pointer-into-isolate_lru_pages.patch
mm-vmscan-push-zone-pointer-into-shrink_page_list.patch
mm-vmscan-remove-update_isolated_counts.patch
mm-vmscan-push-lruvec-pointer-into-putback_inactive_pages.patch
mm-vmscan-replace-zone_nr_lru_pages-with-get_lruvec_size.patch
mm-vmscan-push-lruvec-pointer-into-inactive_list_is_low.patch
mm-vmscan-push-lruvec-pointer-into-shrink_list.patch
mm-vmscan-push-lruvec-pointer-into-get_scan_count.patch
mm-vmscan-push-lruvec-pointer-into-should_continue_reclaim.patch
mm-vmscan-kill-struct-mem_cgroup_zone.patch
memcg-fix-error-code-in-hugetlb_force_memcg_empty.patch
rescounters-add-res_counter_uncharge_until.patch
memcg-use-res_counter_uncharge_until-in-move_parent.patch
memcg-move-charges-to-root-cgroup-if-use_hierarchy=0.patch
memcg-dont-uncharge-in-mem_cgroup_move_account.patch
remove-__must_check-for-res_counter_charge_nofail.patch
fork-call-complete_vfork_done-after-clearing-child_tid-and-flushing-rss-counters.patch
fs-proc-introduce-proc-pid-task-tid-children-entry-v9.patch
fs-proc-introduce-proc-pid-task-tid-children-entry-v9-fix.patch
c-r-procfs-add-arg_start-end-env_start-end-and-exit_code-members-to-proc-pid-stat.patch
c-r-prctl-extend-pr_set_mm-to-set-up-more-mm_struct-entries-v2.patch
c-r-prctl-simplify-pr_set_mm-on-mm-code-data-assignment.patch
c-r-prctl-simplify-pr_set_mm-on-mm-code-data-assignment-fix.patch
c-r-prctl-return-efault-instead-of-einval-in-case-if-underlied-vma-is-not-found.patch
c-r-prctl-drop-vma-flags-test-on-pr_set_mm_-stack-data-assignment.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html