Re: [PATCH 2/4] mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range()

On 2024/8/7 19:14, David Hildenbrand wrote:
> On 07.08.24 09:39, Miaohe Lin wrote:
>> On 2024/8/6 17:29, David Hildenbrand wrote:
>>> On 02.08.24 09:50, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2024/8/2 4:10, David Hildenbrand wrote:
>>>>>
>>>>>>>
>>>>>>> We're not checking the head page here, will this work reliably for
>>>>>>> hugetlb? (I recall some difference in per-page hwpoison handling between
>>>>>>> hugetlb and THP due to the vmemmap optimization)
>>>>>>
>>>>>> Before this change, do_migrate_range() would not try to unmap a
>>>>>> hwpoisoned hugetlb page; we hope it was already unmapped in
>>>>>> memory_failure(). As the comments mention, that unmap may fail, so this
>>>>>> adds a new safeguard that tries to unmap it again here, but we don't
>>>>>> need to guarantee it.
>>>>>
>>>>> Thanks for clarifying!
>>>>>
>>>>> But I do wonder if the PageHWPoison() is the right thing to do for hugetlb.
>>>>>
>>>>> IIUC, hugetlb requires folio_test_hwpoison() -- testing the head page
>>>>> not the subpage. Reason is that due to the vmemmap optimization we might
>>>>> not be able to modify subpages to set hwpoison.
>>>>
>>>> Yes, HVO is special (only the head page carries hwpoison). Since we
>>>> always want to check the head page here (next pfn = head_pfn + nr),
>>>> using only PageHWPoison() might be enough, but just in case, let's also
>>>> add a hwpoison check for the head page:
>>>>
>>>>      if (unlikely(PageHWPoison(page) || folio_test_hwpoison(folio)))
>>>
>>> I also wonder whether, for large folios, we have to check
>>> folio_test_has_hwpoisoned(): whether any subpage is poisoned, not just
>>> the current page.
>>>
>>
>> IMHO, below if condition [1] should be fine to check for any hwpoisoned folio:
>>
>>   if (folio_test_hwpoison(folio) ||
>>        (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
>>
>> 1. For raw pages, folio_test_hwpoison(folio) works fine.
>> 2. For thp (memory_failure fails to split it in first place), folio_test_has_hwpoisoned(folio) works fine.
>> 3. For hugetlb, we always have hwpoison flag set for folio. So folio_test_hwpoison(folio) works fine.

It seems I missed one corner case. When memory_failure() encounters an isolated THP, get_hwpoison_page() returns -EIO and
the THP won't have the has_hwpoisoned flag set, so the pattern above might not work in that case. :(
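
To make the cases concrete, here is a minimal user-space sketch (not kernel code) of the condition quoted above. The struct and the helper folio_model_contains_hwpoison() are hypothetical names made up for illustration; the three fields only stand in for what folio_test_hwpoison(), folio_test_large() and folio_test_has_hwpoisoned() would report. For the isolated THP it assumes the poisoned subpage is not the head page.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of the folio state discussed in this
 * thread. The fields only mirror what the kernel helpers
 * folio_test_hwpoison(), folio_test_large() and
 * folio_test_has_hwpoisoned() would report; nothing here is kernel API.
 */
struct folio_model {
	bool hwpoison;        /* hwpoison flag on the folio (head page) */
	bool large;           /* folio spans more than one page */
	bool has_hwpoisoned;  /* large folio marked as holding a poisoned subpage */
};

/* The condition proposed in this thread, applied to the model. */
static bool folio_model_contains_hwpoison(const struct folio_model *f)
{
	return f->hwpoison || (f->large && f->has_hwpoisoned);
}

/* Cases 1-3 from the quoted list, plus the isolated-THP corner case. */
static const struct folio_model raw_page     = { .hwpoison = true };
static const struct folio_model unsplit_thp  = { .large = true, .has_hwpoisoned = true };
static const struct folio_model hugetlb      = { .hwpoison = true, .large = true };
/* Assumption: poisoned tail page, and no folio-level flag was set. */
static const struct folio_model isolated_thp = { .large = true };
```

In this model the first three cases are caught and the isolated THP is not, which is the gap described above.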

>>
>> But the folio itself might not be the hwpoisoned page, i.e. subpages might be hwpoisoned instead.
>> Or am I missing something?
> 
> Yes, but we can only migrate full folios, and if any subpage is poisoned we're in trouble and have to effectively force-unmap it?

Yes, I agree with you.

> 
> At least that's my understanding :)

Thanks.