On Tue, May 25, 2021 at 11:09:18AM +0200, Oscar Salvador wrote: > On Tue, May 25, 2021 at 08:07:07AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote: > > OK, here's the current draft. > > > > Thanks, > > Naoya Horiguchi > > > > --- > > From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx> > > Date: Tue, 18 May 2021 23:49:18 +0900 > > Subject: [PATCH] mm,hwpoison: fix race with hugetlb page allocation > > > > When hugetlb page fault (under overcommitting situation) and > > memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following race: > > > > CPU0: CPU1: > > > > gather_surplus_pages() > > page = alloc_surplus_huge_page() > > memory_failure_hugetlb() > > get_hwpoison_page(page) > > __get_hwpoison_page(page) > > get_page_unless_zero(page) > > zero = put_page_testzero(page) > > VM_BUG_ON_PAGE(!zero, page) > > enqueue_huge_page(h, page) > > put_page(page) > > > > __get_hwpoison_page() only checks the page refcount before taking an > > additional one for memory error handling, which is wrong because there's > > a time window where compound pages have non-zero refcount during > > initialization. So make __get_hwpoison_page() check page status a bit > > more for hugetlb pages. > > I think that this changelog would benefit from some information about the new > !PageLRU && !__PageMovable check. OK, I'll update about this check. > > static int __get_hwpoison_page(struct page *page) > > { > > struct page *head = compound_head(page); > > + int ret = 0; > > + bool hugetlb = false; > > + > > + ret = get_hwpoison_huge_page(head, &hugetlb); > > + if (hugetlb) > > + return ret; > > + > > + if (!PageLRU(head) && !__PageMovable(head)) > > + return 0; > > This definitely needs a comment hinting the reader why we need to check for this. > AFAICS, this is to close the race where a page is about to be a hugetlb page soon, > so we do not go for get_page_unless_zero(), right? Right, I can't find any other reliable check to show that a page is not under initialization of hugetlb pages. I'll state why this check is needed. > > From soft_offline_page's POV I __guess__ that's fine because we only deal with > pages we know about. > But what about memory_failure()? I think memory_failure() is less picky about that, > so it is okay to not take a refcount on that case? Actually the coverage of page types for which memory error can be handled are the same between memory_failure() and soft_offline_page(). One memory_failure-specific role is to print out logs about error events even if they can't be successfully handled, which could be affected by this change, so we might need to adjust how memory_failure() uses the return value of get_hwpoison_page(). Thanks, Naoya Horiguchi