The patch titled
     Subject: hugetlb: simplify prep_compound_gigantic_page ref count racing code
has been added to the -mm tree.  Its filename is
     hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Subject: hugetlb: simplify prep_compound_gigantic_page ref count racing code

Patch series "hugetlb: fix potential ref counting races".

When Muchun Song brought up a potential issue with hugetlb ref
counting[1], I started looking closer at the code.  hugetlbfs is the only
code with its own specialized compound page destructor and the only code
that takes special action when ref counts drop to zero.  Potential races
happen in this unique handling of ref counts.  The following patches
address these races when creating and destroying hugetlb pages.

These potential races have likely existed since the creation of hugetlbfs.
They certainly have been around for more than 10 years.  However, I am
unaware of anyone actually hitting these races.  It is VERY unlikely that
anyone will actually hit these races, but they do exist.

I could not think of an easy (or difficult) way to force these races.
Therefore, testing consisted of adding code to randomly increase ref
counts in strategic places.
In this way, I was able to exercise all the race handling code paths.

[1] https://lore.kernel.org/linux-mm/CAMZfGtVMn3daKrJwZMaVOGOaJU+B4dS--x_oPmGQMD=c=QNGEg@xxxxxxxxxxxxxx/

This patch (of 3):

Code in prep_compound_gigantic_page waits for an RCU grace period if it
notices a temporarily inflated ref count on a tail page.  This was done
because of the identified potential race with speculative page cache
references, which could only last for an RCU grace period.  This is overly
complicated as this situation is VERY unlikely to ever happen.  Instead,
just quickly return an error.

Also, only print a warning in prep_compound_gigantic_page instead of in
multiple callers.

Link: https://lkml.kernel.org/r/20210710002441.167759-1-mike.kravetz@xxxxxxxxxx
Link: https://lkml.kernel.org/r/20210710002441.167759-2-mike.kravetz@xxxxxxxxxx
Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Mina Almasry <almasrymina@xxxxxxxxxx>
Cc: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxxxx>
Cc: Oscar Salvador <osalvador@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/hugetlb.c |   15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

--- a/mm/hugetlb.c~hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code
+++ a/mm/hugetlb.c
@@ -1657,16 +1657,12 @@ static bool prep_compound_gigantic_page(
 		 * cache adding could take a ref on a 'to be' tail page.
 		 * We need to respect any increased ref count, and only set
 		 * the ref count to zero if count is currently 1.  If count
-		 * is not 1, we call synchronize_rcu in the hope that a rcu
-		 * grace period will cause ref count to drop and then retry.
-		 * If count is still inflated on retry we return an error and
-		 * must discard the pages.
+		 * is not 1, we return an error and caller must discard the
+		 * pages.
 		 */
 		if (!page_ref_freeze(p, 1)) {
-			pr_info("HugeTLB unexpected inflated ref count on freshly allocated page\n");
-			synchronize_rcu();
-			if (!page_ref_freeze(p, 1))
-				goto out_error;
+			pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
+			goto out_error;
 		}
 		set_page_count(p, 0);
 		set_compound_head(p, page);
@@ -1830,7 +1826,6 @@ retry:
 				retry = true;
 				goto retry;
 			}
-			pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
 			return NULL;
 		}
 	}
@@ -2828,8 +2823,8 @@ static void __init gather_bootmem_preall
 			prep_new_huge_page(h, page, page_to_nid(page));
 			put_page(page); /* add to the hugepage allocator */
 		} else {
+			/* VERY unlikely inflated ref count on a tail page */
 			free_gigantic_page(page, huge_page_order(h));
-			pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
 		}
 
 		/*
_

Patches currently in -mm which might be from mike.kravetz@xxxxxxxxxx are

hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code.patch
hugetlb-drop-ref-count-earlier-after-page-allocation.patch
hugetlb-before-freeing-hugetlb-page-set-dtor-to-appropriate-value.patch
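
For readers who want to experiment with the simplified behavior outside the
kernel, the following is a minimal userspace sketch, not kernel code.  It
assumes page_ref_freeze(p, 1) behaves like a compare-and-exchange that sets
the count to zero only when it currently equals 1, and it models the new
"warn once and return an error" path.  The names model_page,
model_ref_freeze and model_prep_gigantic are hypothetical, and the rollback
loop only illustrates what discarding the pages might involve.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page: only the fields this sketch needs. */
struct model_page {
	atomic_int refcount;
	bool is_tail;
};

/*
 * Assumed model of page_ref_freeze(): atomically set the count to zero,
 * but only if it currently equals 'expected'.  Returns true on success.
 */
static bool model_ref_freeze(struct model_page *p, int expected)
{
	return atomic_compare_exchange_strong(&p->refcount, &expected, 0);
}

/*
 * Model of the simplified flow: freeze each would-be tail page.  On the
 * first inflated count, warn, undo the tails already converted, and
 * return false with no grace-period wait and no retry.
 */
static bool model_prep_gigantic(struct model_page *pages, size_t nr_pages)
{
	size_t i, j;

	assert(nr_pages > 0);

	for (i = 1; i < nr_pages; i++) {
		if (!model_ref_freeze(&pages[i], 1)) {
			fprintf(stderr, "page can not be used due to unexpected inflated ref count\n");
			/* Illustrative rollback of tails converted so far. */
			for (j = 1; j < i; j++) {
				pages[j].is_tail = false;
				atomic_store(&pages[j].refcount, 1);
			}
			return false;
		}
		pages[i].is_tail = true;
	}
	return true;
}
```

Bumping one tail page's count before calling model_prep_gigantic() mimics
the testing approach described in the series cover text, where ref counts
were increased at strategic places to force the error path.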