On Tue, Jun 15, 2021 at 08:57:06PM +0800, Ding Hui wrote:
> On 2021/6/14 10:12, Naoya Horiguchi wrote:
> > @@ -1956,17 +1938,6 @@ int unpoison_memory(unsigned long pfn)
> >  		goto unlock_mutex;
> >  	}
> >
> > -	/*
> > -	 * unpoison_memory() can encounter thp only when the thp is being
> > -	 * worked by memory_failure() and the page lock is not held yet.
> > -	 * In such case, we yield to memory_failure() and make unpoison fail.
> > -	 */
> > -	if (!PageHuge(page) && PageTransHuge(page)) {
> > -		unpoison_pr_info("Unpoison: Memory failure is now running on %#lx\n",
> > -				 pfn, &unpoison_rs);
> > -		goto unlock_mutex;
> > -	}
> > -
>
> if a huge page is in process of alloc or free, HUGETLB_PAGE_DTOR can be set
> after __SetPageHead() or be cleared before __ClearPageHead(), so this
> condition may be true in racy.

Hi Ding,

We confirm PageHWPoison() before reaching this if-block, and hwpoisoned
pages are prohibited from allocation, so it seems to me that this check
never races with hugetlb allocation.

According to the original patch that introduced this if-block
(0cea3fdc416d: "mm/hwpoison: fix race against poison thp"), the if-block
was intended to close the race between memory_failure() and
unpoison_memory(), so it is no longer necessary now that mf_mutex
serializes the two.

> Do we need the racy test for this situation?

I'm not sure, but I think that we need more stress/fuzz testing focusing
on this subsystem, and the "unpoison vs allocation" race can be covered
under that topic.

Thank you,
Naoya Horiguchi