On 07.08.24 09:39, Miaohe Lin wrote:
On 2024/8/6 17:29, David Hildenbrand wrote:
On 02.08.24 09:50, Kefeng Wang wrote:
On 2024/8/2 4:10, David Hildenbrand wrote:
We're not checking the head page here, will this work reliably for
hugetlb? (I recall some difference in per-page hwpoison handling between
hugetlb and THP due to the vmemmap optimization)
Before this changes, the hwposioned hugetlb page won't try to unmap in
do_migrate_range(), we hope it already unmapped in memory_failure(), as
mentioned from comments, there maybe fail to unmap, so a new safeguard
to try to unmap it again here, but we don't need to guarantee it.
Thanks for clarifying!
But I do wonder if the PageHWPoison() is the right thing to do for hugetlb.
IIUC, hugetlb requires folio_test_hwpoison() -- testing the head page
not the subpage. Reason is that due to the vmemmap optimization we might
not be able to modify subpages to set hwpoison.
Yes, HVO is special(only head page with hwpoison), since we always want
to check head page here (next pfn = head_pfn + nr), so it might be
enough to only use PageHWpoison, but just in case, adding hwpoison check
for the head page,
if (unlikely(PageHWPoison(page) || folio_test_hwpoison(folio)))
I also do wonder if we have to check for large folios folio_test_has_hwpoison():
if any subpage is poisoned, not just the current page.
IMHO, below if condition [1] should be fine to check for any hwpoisoned folio:
if (folio_test_hwpoison(folio) ||
(folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
1. For raw pages, folio_test_hwpoison(folio) works fine.
2. For thp (memory_failure fails to split it in first place), folio_test_has_hwpoisoned(folio) works fine.
3. For hugetlb, we always have hwpoison flag set for folio. So folio_test_hwpoison(folio) works fine.
But folio might not be the right hwpoisoned page, i.e. subpages might be hwpoisoned instead.
Or am I miss something?
Yes, but we can only migrate full folios, and if any subpage is poisoned
we're in trouble and have to effectively force-unmap it?
At least that's my understanding :)
--
Cheers,
David / dhildenb