On 3/11/22 23:46, Miaohe Lin wrote: > There is a race window where we got the compound_head, the hugetlb page > could be freed to buddy, or even changed to another compound page just > before we try to get hwpoison page. Think about the below race window: > CPU 1 CPU 2 > memory_failure_hugetlb > struct page *head = compound_head(p); > hugetlb page might be freed to > buddy, or even changed to another > compound page. > > get_hwpoison_page -- page is not what we want now... > > If this race happens, just bail out. Also MF_MSG_DIFFERENT_PAGE_SIZE is > introduced to record this event. > > Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx> > --- > include/linux/mm.h | 1 + > include/ras/ras_event.h | 1 + > mm/memory-failure.c | 12 ++++++++++++ > 3 files changed, 14 insertions(+) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index c9bada4096ac..ef98cff2b253 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -3253,6 +3253,7 @@ enum mf_action_page_type { > MF_MSG_BUDDY, > MF_MSG_DAX, > MF_MSG_UNSPLIT_THP, > + MF_MSG_DIFFERENT_PAGE_SIZE, > MF_MSG_UNKNOWN, > }; > > diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h > index d0337a41141c..1e694fd239b9 100644 > --- a/include/ras/ras_event.h > +++ b/include/ras/ras_event.h > @@ -374,6 +374,7 @@ TRACE_EVENT(aer_event, > EM ( MF_MSG_BUDDY, "free buddy page" ) \ > EM ( MF_MSG_DAX, "dax page" ) \ > EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" ) \ > + EM ( MF_MSG_DIFFERENT_PAGE_SIZE, "different page size" ) \ > EMe ( MF_MSG_UNKNOWN, "unknown page" ) > > /* > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 5444a8ef4867..dabecd87ad3f 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -733,6 +733,7 @@ static const char * const action_page_types[] = { > [MF_MSG_BUDDY] = "free buddy page", > [MF_MSG_DAX] = "dax page", > [MF_MSG_UNSPLIT_THP] = "unsplit thp", > + [MF_MSG_DIFFERENT_PAGE_SIZE] = "different page size", > [MF_MSG_UNKNOWN] = "unknown page", > }; > > @@ -1534,6 +1535,17 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) > } > > lock_page(head); > + > + /** > + * The page could have changed compound pages due to race window. > + * If this happens just bail out. > + */ > + if (!PageHuge(p) || compound_head(p) != head) { > + action_result(pfn, MF_MSG_DIFFERENT_PAGE_SIZE, MF_IGNORED); > + res = -EBUSY; We have discussed this race in other versions of the patch. When we encounter the race, we have likely marked poison on the wrong page. Correct? Instead of printing a "different page size", would it be better to perhaps: - Print a message that wrong page may be marked for poison? - Clear the poison flag in the "head page" previously set? -- Mike Kravetz > + goto out; > + } > + > page_flags = head->flags; > > if (hwpoison_filter(p)) {