The patch titled
     hugetlb: fix pool shrinking while in restricted cpuset
has been added to the -mm tree.  Its filename is
     hugetlb-fix-pool-shrinking-while-in-restricted-cpuset.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/

------------------------------------------------------
Subject: hugetlb: fix pool shrinking while in restricted cpuset
From: Nishanth Aravamudan <nacc@xxxxxxxxxx>

Adam Litke noticed that currently we grow the hugepage pool independent of
any cpuset the running process may be in, but when shrinking the pool, the
cpuset is checked.  This leads to inconsistency when shrinking the pool in a
restricted cpuset -- an administrator may have been able to grow the pool on
a node restricted by a containing cpuset, but they cannot shrink it there.

There are two options: either prevent growing of the pool outside of the
cpuset or allow shrinking outside of the cpuset.  From previous discussions
on linux-mm, /proc/sys/vm/nr_hugepages is an administrative interface that
should not be restricted by cpusets.  So allow shrinking the pool by removing
pages from nodes outside of current's cpuset.

Signed-off-by: Nishanth Aravamudan <nacc@xxxxxxxxxx>
Cc: Adam Litke <agl@xxxxxxxxxx>
Cc: William Irwin <wli@xxxxxxxxxxxxxx>
Cc: Lee Schermerhorn <Lee.Schermerhonr@xxxxxx>
Cc: Christoph Lameter <clameter@xxxxxxx>
Cc: Paul Jackson <pj@xxxxxxx>
Cc: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/hugetlb.c |   26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff -puN mm/hugetlb.c~hugetlb-fix-pool-shrinking-while-in-restricted-cpuset mm/hugetlb.c
--- a/mm/hugetlb.c~hugetlb-fix-pool-shrinking-while-in-restricted-cpuset
+++ a/mm/hugetlb.c
@@ -71,7 +71,25 @@ static void enqueue_huge_page(struct pag
 	free_huge_pages_node[nid]++;
 }
 
-static struct page *dequeue_huge_page(struct vm_area_struct *vma,
+static struct page *dequeue_huge_page(void)
+{
+	int nid;
+	struct page *page = NULL;
+
+	for (nid = 0; nid < MAX_NUMNODES; ++nid) {
+		if (!list_empty(&hugepage_freelists[nid])) {
+			page = list_entry(hugepage_freelists[nid].next,
+					  struct page, lru);
+			list_del(&page->lru);
+			free_huge_pages--;
+			free_huge_pages_node[nid]--;
+			break;
+		}
+	}
+	return page;
+}
+
+static struct page *dequeue_huge_page_vma(struct vm_area_struct *vma,
 				unsigned long address)
 {
 	int nid;
@@ -417,7 +435,7 @@ static struct page *alloc_huge_page_shar
 	struct page *page;
 
 	spin_lock(&hugetlb_lock);
-	page = dequeue_huge_page(vma, addr);
+	page = dequeue_huge_page_vma(vma, addr);
 	spin_unlock(&hugetlb_lock);
 	return page ? page : ERR_PTR(-VM_FAULT_OOM);
 }
@@ -432,7 +450,7 @@ static struct page *alloc_huge_page_priv
 
 	spin_lock(&hugetlb_lock);
 	if (free_huge_pages > resv_huge_pages)
-		page = dequeue_huge_page(vma, addr);
+		page = dequeue_huge_page_vma(vma, addr);
 	spin_unlock(&hugetlb_lock);
 	if (!page) {
 		page = alloc_buddy_huge_page(vma, addr);
@@ -585,7 +603,7 @@ static unsigned long set_max_huge_pages(
 	min_count = max(count, min_count);
 	try_to_free_low(min_count);
 	while (min_count < persistent_huge_pages) {
-		struct page *page = dequeue_huge_page(NULL, 0);
+		struct page *page = dequeue_huge_page();
 		if (!page)
 			break;
 		update_and_free_page(page);
_

Patches currently in -mm which might be from nacc@xxxxxxxxxx are

hugetlb-fix-pool-shrinking-while-in-restricted-cpuset.patch
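
For readers who want to see the behavioural difference in isolation, here is a
rough user-space sketch of the shrink path before and after this patch.  It is
not kernel code: struct pool_model, the model_* helpers, MAX_NODES and the
allowed_mask bitmask are all hypothetical stand-ins (for hugepage_freelists[],
free_huge_pages_node[], MAX_NUMNODES and current's cpuset respectively), and
the loop condition is simplified compared with set_max_huge_pages(), which
compares against persistent_huge_pages.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 4	/* hypothetical stand-in for MAX_NUMNODES */

/* Hypothetical model: plain counters instead of per-node free lists. */
struct pool_model {
	unsigned long free_per_node[MAX_NODES];
	unsigned long free_total;
};

/* Add one free huge page to node nid (pool growth ignores the cpuset). */
static void model_enqueue(struct pool_model *p, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	p->free_per_node[nid]++;
	p->free_total++;
}

/*
 * Take one free page from the first node that has one.  With honor_cpuset
 * set, nodes outside allowed_mask are skipped (the pre-patch behaviour as
 * described in the changelog); otherwise every node is scanned, as the new
 * dequeue_huge_page() does.  Returns the node id, or -1 if nothing was found.
 */
static int model_dequeue(struct pool_model *p, bool honor_cpuset,
			 unsigned int allowed_mask)
{
	int nid;

	for (nid = 0; nid < MAX_NODES; ++nid) {
		if (honor_cpuset && !(allowed_mask & (1u << nid)))
			continue;
		if (p->free_per_node[nid] > 0) {
			p->free_per_node[nid]--;
			p->free_total--;
			return nid;
		}
	}
	return -1;
}

/* Shrink towards target free pages; returns how many free pages remain. */
static unsigned long model_shrink(struct pool_model *p, unsigned long target,
				  bool honor_cpuset, unsigned int allowed_mask)
{
	while (p->free_total > target) {
		if (model_dequeue(p, honor_cpuset, allowed_mask) < 0)
			break;
	}
	return p->free_total;
}
```

With one free page on node 0, two on node 1, and a cpuset that only allows
node 0, the cpuset-honouring variant gets stuck with the two pages on node 1,
while the cpuset-agnostic variant drains the pool completely.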