The patch titled Subject: mm: allocate kernel pages to the right memcg has been added to the -mm tree. Its filename is mm-allocate-kernel-pages-to-the-right-memcg.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/SubmitChecklist when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Glauber Costa <glommer@xxxxxxxxxxxxx> Subject: mm: allocate kernel pages to the right memcg When a process tries to allocate a page with the __GFP_KMEMCG flag, the page allocator will call the corresponding memcg functions to validate the allocation. Tasks in the root memcg can always proceed. To avoid adding markers to the page - and a kmem flag that would necessarily follow, as much as doing page_cgroup lookups for no reason, whoever is marking its allocations with __GFP_KMEMCG flag is responsible for telling the page allocator that this is such an allocation at free_pages() time. This is done by the invocation of __free_accounted_pages() and free_accounted_pages(). Signed-off-by: Glauber Costa <glommer@xxxxxxxxxxxxx> Acked-by: Michal Hocko <mhocko@xxxxxxx> Acked-by: Mel Gorman <mgorman@xxxxxxx> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> Acked-by: David Rientjes <rientjes@xxxxxxxxxx> Cc: Christoph Lameter <cl@xxxxxxxxx> Cc: Pekka Enberg <penberg@xxxxxxxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Suleiman Souhlal <suleiman@xxxxxxxxxx> Cc: Tejun Heo <tj@xxxxxxxxxx> Cc: Frederic Weisbecker <fweisbec@xxxxxxxxxx> Cc: Greg Thelen <gthelen@xxxxxxxxxx> Cc: JoonSoo Kim <js1304@xxxxxxxxx> Cc: Rik van Riel <riel@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- include/linux/gfp.h | 3 +++ mm/page_alloc.c | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff -puN include/linux/gfp.h~mm-allocate-kernel-pages-to-the-right-memcg include/linux/gfp.h --- a/include/linux/gfp.h~mm-allocate-kernel-pages-to-the-right-memcg +++ a/include/linux/gfp.h @@ -362,6 +362,9 @@ extern void free_pages(unsigned long add extern void free_hot_cold_page(struct page *page, int cold); extern void free_hot_cold_page_list(struct list_head *list, int cold); +extern void __free_memcg_kmem_pages(struct page *page, unsigned int order); +extern void free_memcg_kmem_pages(unsigned long addr, unsigned int order); + #define __free_page(page) __free_pages((page), 0) #define free_page(addr) free_pages((addr), 0) diff -puN mm/page_alloc.c~mm-allocate-kernel-pages-to-the-right-memcg mm/page_alloc.c --- a/mm/page_alloc.c~mm-allocate-kernel-pages-to-the-right-memcg +++ a/mm/page_alloc.c @@ -2598,6 +2598,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, u int migratetype = allocflags_to_migratetype(gfp_mask); unsigned int cpuset_mems_cookie; int alloc_flags = ALLOC_WMARK_LOW|ALLOC_CPUSET; + struct mem_cgroup *memcg = NULL; gfp_mask &= gfp_allowed_mask; @@ -2616,6 +2617,13 @@ __alloc_pages_nodemask(gfp_t gfp_mask, u if (unlikely(!zonelist->_zonerefs->zone)) return NULL; + /* + * Will only have any effect when __GFP_KMEMCG is set. This is + * verified in the (always inline) callee + */ + if (!memcg_kmem_newpage_charge(gfp_mask, &memcg, order)) + return NULL; + retry_cpuset: cpuset_mems_cookie = get_mems_allowed(); @@ -2651,6 +2659,8 @@ out: if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page)) goto retry_cpuset; + memcg_kmem_commit_charge(page, memcg, order); + return page; } EXPORT_SYMBOL(__alloc_pages_nodemask); @@ -2703,6 +2713,31 @@ void free_pages(unsigned long addr, unsi EXPORT_SYMBOL(free_pages); +/* + * __free_memcg_kmem_pages and free_memcg_kmem_pages will free + * pages allocated with __GFP_KMEMCG. + * + * Those pages are accounted to a particular memcg, embedded in the + * corresponding page_cgroup. To avoid adding a hit in the allocator to search + * for that information only to find out that it is NULL for users who have no + * interest in that whatsoever, we provide these functions. + * + * The caller knows better which flags it relies on. + */ +void __free_memcg_kmem_pages(struct page *page, unsigned int order) +{ + memcg_kmem_uncharge_pages(page, order); + __free_pages(page, order); +} + +void free_memcg_kmem_pages(unsigned long addr, unsigned int order) +{ + if (addr != 0) { + VM_BUG_ON(!virt_addr_valid((void *)addr)); + __free_memcg_kmem_pages(virt_to_page((void *)addr), order); + } +} + static void *make_alloc_exact(unsigned long addr, unsigned order, size_t size) { if (addr) { _ Patches currently in -mm which might be from glommer@xxxxxxxxxxxxx are linux-next.patch memcg-make-it-possible-to-use-the-stock-for-more-than-one-page.patch memcg-reclaim-when-more-than-one-page-needed.patch memcg-change-defines-to-an-enum.patch memcg-kmem-accounting-basic-infrastructure.patch mm-add-a-__gfp_kmemcg-flag.patch memcg-kmem-controller-infrastructure.patch mm-allocate-kernel-pages-to-the-right-memcg.patch res_counter-return-amount-of-charges-after-res_counter_uncharge.patch memcg-kmem-accounting-lifecycle-management.patch memcg-use-static-branches-when-code-not-in-use.patch memcg-allow-a-memcg-with-kmem-charges-to-be-destructed.patch memcg-execute-the-whole-memcg-freeing-in-free_worker.patch fork-protect-architectures-where-thread_size-=-page_size-against-fork-bombs.patch memcg-add-documentation-about-the-kmem-controller.patch slab-slub-struct-memcg_params.patch slab-annotate-on-slab-caches-nodelist-locks.patch slab-slub-consider-a-memcg-parameter-in-kmem_create_cache.patch memcg-allocate-memory-for-memcg-caches-whenever-a-new-memcg-appears.patch memcg-infrastructure-to-match-an-allocation-to-the-right-cache.patch memcg-skip-memcg-kmem-allocations-in-specified-code-regions.patch slb-always-get-the-cache-from-its-page-in-kmem_cache_free.patch slb-allocate-objects-from-memcg-cache.patch memcg-destroy-memcg-caches.patch memcg-slb-track-all-the-memcg-children-of-a-kmem_cache.patch memcg-slb-shrink-dead-caches.patch memcg-aggregate-memcg-cache-values-in-slabinfo.patch slab-propagate-tunable-values.patch slub-slub-specific-propagation-changes.patch kmem-add-slab-specific-documentation-about-the-kmem-controller.patch -- To unsubscribe from this list: send the line "unsubscribe mm-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html