On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> wrote: > > Pages are individually marked as suffering from hardware poisoning. > Checking that the head page is not hardware poisoned doesn't make > sense; we might be after a subpage. We check each page individually > before we use it, so this was an optimisation gone wrong. Yeah, it doesn't make too much sense to check the head page. And it seems the non-poisoned subpages could be PTE mapped instead of skipping the whole THP. Not sure if this is by design, it seems the hwpoisoned check in filemap_map_pages() does skip the subpages after the poisoned page. Or we should just skip the poisoned page itself? If so the below change may be needed: diff --git a/mm/filemap.c b/mm/filemap.c index daa0e23a6ee6..f1f0cb263b4a 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, do { page = find_subpage(head, xas.xa_index); if (PageHWPoison(page)) - goto unlock; + goto skip; if (mmap_miss > 0) mmap_miss--; @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, do_set_pte(vmf, page, addr); /* no need to invalidate: a not-present page won't be cached */ update_mmu_cache(vma, addr, vmf->pte); +skip: unlock_page(head); continue; unlock: > > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> > --- > mm/filemap.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/mm/filemap.c b/mm/filemap.c > index 0b6f996108b4..65973204112d 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page, > goto skip; > if (!PageUptodate(page) || PageReadahead(page)) > goto skip; > - if (PageHWPoison(page)) > - goto skip; > if (!trylock_page(page)) > goto skip; > if (page->mapping != mapping) > -- > 2.33.0 >