On Thu, Sep 30, 2021 at 02:53:08PM -0700, Yang Shi wrote:
> @@ -1148,8 +1148,12 @@ static int __get_hwpoison_page(struct page *page)
>  		return -EBUSY;
>  
>  	if (get_page_unless_zero(head)) {
> -		if (head == compound_head(page))
> +		if (head == compound_head(page)) {
> +			if (PageTransHuge(head))
> +				SetPageHasHWPoisoned(head);
> +
>  			return 1;
> +		}
>  
>  		pr_info("Memory failure: %#lx cannot catch tail\n",
>  			page_to_pfn(page));

Sorry for the late comments.

I'm wondering whether it's ideal to set this bit here, as
get_hwpoison_page() sounds like a pure helper to get a refcount out of a
sane hwpoisoned page.  I'm afraid there can be a side effect where we set
this bit without noticing, so I'm also wondering whether we should keep it
in memory_failure().

Quoting the comment for get_hwpoison_page():

 * get_hwpoison_page() takes a page refcount of an error page to handle memory
 * error on it, after checking that the error page is in a well-defined state
 * (defined as a page-type we can successfully handle the memor error on it,
 * such as LRU page and hugetlb page).

For example, I see that both unpoison_memory() and soft_offline_page() call
it too.  Does that mean we'll also set the bit even when, e.g., we only want
to inject an unpoison event?

Thanks,

-- 
Peter Xu