On Wed, Sep 22, 2021 at 08:28:26PM -0700, Yang Shi wrote:
> When handling shmem page fault the THP with corrupted subpage could be PMD
> mapped if certain conditions are satisfied. But kernel is supposed to
> send SIGBUS when trying to map hwpoisoned page.
>
> There are two paths which may do PMD map: fault around and regular fault.
>
> Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> the thing was even worse in fault around path. The THP could be PMD mapped as
> long as the VMA fits regardless what subpage is accessed and corrupted. After
> this commit as long as head page is not corrupted the THP could be PMD mapped.
>
> In the regulat fault path the THP could be PMD mapped as long as the corrupted

s/regulat/regular/

> page is not accessed and the VMA fits.
>
> This loophole could be fixed by iterating every subpage to check if any
> of them is hwpoisoned or not, but it is somewhat costly in page fault path.
>
> So introduce a new page flag called HasHWPoisoned on the first tail page. It
> indicates the THP has hwpoisoned subpage(s). It is set if any subpage of THP
> is found hwpoisoned by memory failure and cleared when the THP is freed or
> split.
>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Suggested-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
> ---

...

> diff --git a/mm/filemap.c b/mm/filemap.c
> index dae481293b5d..740b7afe159a 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3195,12 +3195,14 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
>  	}
>
>  	if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
> -		vm_fault_t ret = do_set_pmd(vmf, page);
> -		if (!ret) {
> -			/* The page is mapped successfully, reference consumed. */
> -			unlock_page(page);
> -			return true;
> -		}
> +		vm_fault_t ret = do_set_pmd(vmf, page);
> +		if (ret == VM_FAULT_FALLBACK)
> +			goto out;

Hm.. What? I don't get it. Who will establish page table in the pmd then?
> +		if (!ret) {
> +			/* The page is mapped successfully, reference consumed. */
> +			unlock_page(page);
> +			return true;
> +		}
>  	}
>
>  	if (pmd_none(*vmf->pmd)) {
> @@ -3220,6 +3222,7 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
>  		return true;
>  	}
>
> +out:
>  	return false;
>  }
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 5e9ef0fc261e..0574b1613714 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2426,6 +2426,8 @@ static void __split_huge_page(struct page *page, struct list_head *list,
>  	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
>  	lruvec = lock_page_lruvec(head);
>
> +	ClearPageHasHWPoisoned(head);
> +

Do we serialize the new flag with lock_page() or what? I mean what prevents
the flag being set again after this point, but before ClearPageCompound()?

--
 Kirill A. Shutemov