On 2021/6/16 8:11, HORIGUCHI NAOYA(堀口 直也) wrote:
On Tue, Jun 15, 2021 at 08:57:06PM +0800, Ding Hui wrote:
On 2021/6/14 10:12, Naoya Horiguchi wrote:
@@ -1956,17 +1938,6 @@ int unpoison_memory(unsigned long pfn)
 		goto unlock_mutex;
 	}
-	/*
-	 * unpoison_memory() can encounter thp only when the thp is being
-	 * worked by memory_failure() and the page lock is not held yet.
-	 * In such case, we yield to memory_failure() and make unpoison fail.
-	 */
-	if (!PageHuge(page) && PageTransHuge(page)) {
-		unpoison_pr_info("Unpoison: Memory failure is now running on %#lx\n",
-				 pfn, &unpoison_rs);
-		goto unlock_mutex;
-	}
-
If a huge page is in the process of being allocated or freed, HUGETLB_PAGE_DTOR
can be set after __SetPageHead() or cleared before __ClearPageHead(), so this
condition may be transiently true because of a race.
Hi Ding,
We confirm PageHWPoison() before reaching this if-block and hwpoisoned pages
are prohibited from allocation, so it seems to me that this check never
races with hugetlb allocation.
And according to the original patch that introduced this if-block (0cea3fdc416d:
"mm/hwpoison: fix race against poison thp"), the if-block was intended to close
the race between memory_failure() and unpoison_memory(), so it is no longer
necessary now that mf_mutex is in place.
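
The argument above can be illustrated with a minimal userspace sketch. It is
not kernel code: struct fake_page, fake_memory_failure(),
fake_unpoison_memory() and run_race() are hypothetical stand-ins, and a
pthread mutex plays the role of mf_mutex. It assumes both paths take the same
mutex, so the unpoison side can never observe the transient state that the
removed check was guarding against.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define ROUNDS 10000

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct fake_page {
	bool hwpoison;     /* models PageHWPoison() */
	bool mid_handling; /* models "memory_failure() is still working on it" */
};

/* Userspace analogue of mf_mutex. */
static pthread_mutex_t mf_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Models memory_failure(): the transient state exists only under the mutex. */
static void fake_memory_failure(struct fake_page *p)
{
	pthread_mutex_lock(&mf_mutex);
	p->hwpoison = true;
	p->mid_handling = true;  /* e.g. thp not yet split */
	p->mid_handling = false; /* handling finished */
	pthread_mutex_unlock(&mf_mutex);
}

/* Models unpoison_memory(): 0 on success, -1 if the page is not poisoned. */
static int fake_unpoison_memory(struct fake_page *p)
{
	int ret = -1;

	pthread_mutex_lock(&mf_mutex);
	if (p->hwpoison) {
		/* Serialized with the poison path, so no transient state here. */
		assert(!p->mid_handling);
		p->hwpoison = false;
		ret = 0;
	}
	pthread_mutex_unlock(&mf_mutex);
	return ret;
}

static void *poison_loop(void *arg)
{
	for (int i = 0; i < ROUNDS; i++)
		fake_memory_failure(arg);
	return NULL;
}

static void *unpoison_loop(void *arg)
{
	for (int i = 0; i < ROUNDS; i++)
		fake_unpoison_memory(arg);
	return NULL;
}

/* Runs both paths concurrently; 0 if the threads ran, -1 on setup failure. */
static int run_race(void)
{
	struct fake_page page = { false, false };
	pthread_t a, b;

	if (pthread_create(&a, NULL, poison_loop, &page))
		return -1;
	if (pthread_create(&b, NULL, unpoison_loop, &page)) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}
```

Calling run_race() from a small main() exercises both paths concurrently; the
assert() in fake_unpoison_memory() would fire if the transient state ever
leaked outside the mutex. This only models the locking argument and says
nothing about the separate hugetlb allocation question.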
I got it and thanks for your explanation.
Do we need a test for the race in this situation?
I'm not sure, but I think that we need more stress/fuzz testing focusing on
this subsystem, and the "unpoison vs allocation" race can be covered as part of that.
Thank you,
Naoya Horiguchi
--
Thanks,
- Ding Hui