The patch titled Subject: mm,hwpoison: try to narrow window race for free pages has been added to the -mm tree. Its filename is mmhwpoison-try-to-narrow-window-race-for-free-pages.patch This patch should soon appear at https://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-try-to-narrow-window-race-for-free-pages.patch and later at https://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-try-to-narrow-window-race-for-free-pages.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Oscar Salvador <osalvador@xxxxxxx> Subject: mm,hwpoison: try to narrow window race for free pages Aristeu Rozanski reported that a customer test case started to report -EBUSY after the hwpoison rework patchset. There is a race window between spotting a free page and taking it off its buddy freelist, so it might be that by the time we try to take it off, the page has been already allocated. This patch tries to handle such race window by trying to handle the new type of page again if the page was allocated under us. Link: https://lkml.kernel.org/r/20200922135650.1634-15-osalvador@xxxxxxx Signed-off-by: Oscar Salvador <osalvador@xxxxxxx> Reported-by: Aristeu Rozanski <aris@xxxxxxxxx> Tested-by: Aristeu Rozanski <aris@xxxxxxxxx> Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> Cc: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx> Cc: Dave Hansen <dave.hansen@xxxxxxxxx> Cc: David Hildenbrand <david@xxxxxxxxxx> Cc: Dmitry Yakunin <zeil@xxxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxxxx> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx> Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxx> Cc: Oscar Salvador <osalvador@xxxxxxxx> Cc: Qian Cai <cai@xxxxxx> Cc: Tony Luck <tony.luck@xxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/memory-failure.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) --- a/mm/memory-failure.c~mmhwpoison-try-to-narrow-window-race-for-free-pages +++ a/mm/memory-failure.c @@ -1903,6 +1903,7 @@ int soft_offline_page(unsigned long pfn, { int ret; struct page *page; + bool try_again = true; if (!pfn_valid(pfn)) return -ENXIO; @@ -1918,6 +1919,7 @@ int soft_offline_page(unsigned long pfn, return 0; } +retry: get_online_mems(); ret = get_any_page(page, pfn, flags); put_online_mems(); @@ -1925,7 +1927,10 @@ int soft_offline_page(unsigned long pfn, if (ret > 0) ret = soft_offline_in_use_page(page); else if (ret == 0) - ret = soft_offline_free_page(page); + if (soft_offline_free_page(page) && try_again) { + try_again = false; + goto retry; + } return ret; } _ Patches currently in -mm which might be from osalvador@xxxxxxx are mmhwpoison-unexport-get_hwpoison_page-and-make-it-static.patch mmhwpoison-refactor-madvise_inject_error.patch mmhwpoison-kill-put_hwpoison_page.patch mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch mmhwpoison-rework-soft-offline-for-free-pages.patch mmhwpoison-rework-soft-offline-for-in-use-pages.patch mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch mmhwpoison-try-to-narrow-window-race-for-free-pages.patch mmhwpoison-take-free-pages-off-the-buddy-freelists.patch mmhwpoison-drain-pcplists-before-bailing-out-for-non-buddy-zero-refcount-page.patch mmhwpoison-drop-unneeded-pcplist-draining.patch mmhwpoison-remove-stale-code.patch