Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

On Thu, Feb 25, 2021 at 12:39:30PM +0100, Oscar Salvador wrote:
> On Thu, Feb 25, 2021 at 11:28:18AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> > Hi Aili,
> > 
> > I agree that this set_mce_nospec() is not expected to be called for
> > "already hwpoisoned" page because in the reported case the error
> > page is already contained and no need to resort changing cache mode.
> 
> Out of curiosity, what is the current behaviour now?
> Say we have an ongoing MCE which has marked the page as HWPoison but
> memory_failure did not take any action on the page yet.
> And then, we have another MCE, which ends up there.
> set_mce_nospec might clear _PAGE_PRESENT bit.
> 
> Does that have any impact on the first MCE?

Hi Oscar,

Thank you for shedding light on this; this race looks worrisome to me.
We call try_to_unmap() inside memory_failure(), where we find the affected
ptes with page_vma_mapped_walk() and convert them into hwpoison entries in
try_to_unmap_one().  So there seem to be two racy cases:

  1)
     CPU 0                          CPU 1
     page_vma_mapped_walk
                                    clear _PAGE_PRESENT bit
       // skipped the entry

  2)
     CPU 0                          CPU 1
     page_vma_mapped_walk
       try_to_unmap_one
                                    clear _PAGE_PRESENT bit
         convert the entry
         set_pte_at

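In case 1, the walk skips the entry, so it is not converted into a
hwpoison entry, but the affected processes still get a signal on a later
access.  The info in that SIGBUS could be different, but that's OK.
In case 2, the entry is overwritten by set_pte_at() after the bit is
cleared, so there is no impact.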

Thanks,
Naoya Horiguchi



