Re: File THP and HWPoison

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 18, 2021 at 10:25 AM HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@xxxxxxx> wrote:
>
> On Thu, Mar 18, 2021 at 02:57:16PM +0000, Matthew Wilcox wrote:
> > On Thu, Mar 18, 2021 at 05:08:43PM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Mar 16, 2021 at 02:09:47PM +0000, Matthew Wilcox wrote:
> > > > If we get a memory failure in the middle of a file THP, I think we handle
> > > > it poorly.
> > > >
> > > > int memory_failure(unsigned long pfn, int flags)
> > > > ...
> > > >         if (TestSetPageHWPoison(p)) {
> > > > ...
> > > >         orig_head = hpage = compound_head(p);
> > > > ...
> > > >         if (PageTransHuge(hpage)) {
> > > >                 if (try_to_split_thp_page(p, "Memory Failure") < 0) {
> > > >                         action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
> > > >                         return -EBUSY;
> > > >                 }
> > > >
> > > > static int try_to_split_thp_page(struct page *page, const char *msg)
> > > > {
> > > >         lock_page(page);
> > > >         if (!PageAnon(page) || unlikely(split_huge_page(page))) {
> > > >                 unsigned long pfn = page_to_pfn(page);
> > > >
> > > >                 unlock_page(page);
> > > >                 if (!PageAnon(page))
> > > >                         pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
> > > >                 else
> > > >                         pr_info("%s: %#lx: thp split failed\n", msg, pfn);
> > > >                 put_page(page);
> > > >                 return -EBUSY;
> > > >
> > > > So (for some reason) we don't even try to split a file THP.  But then,
> > > > if we take a page fault on a file THP:
> > > >
> > > > static struct page *next_uptodate_page(struct page *page,
> > > > ...
> > > >                 if (PageHWPoison(page))
> > > >                         goto skip;
> > > > (... but we're only testing the head page here, which isn't necessarily
> > > > the one which got the error ...)
> > > >
> > > >         if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
> > > >             vm_fault_t ret = do_set_pmd(vmf, page);
> > > >
> > > > So we now map the PMD-sized page into userspace, even though it has a
> > > > HWPoison in it.
> > > >
> > > > I think there are two things that we should be doing:
> > > >
> > > > 1. Attempt to split THPs which are file-backed.  That makes most of this
> > > > problem disappear because there won't be THPs with HWPoison, mostly.
> > >
> > > +Naoya. Could you give more context here?
>
> Recently, I tried to address the problem on
> https://lore.kernel.org/linux-mm/20210209062128.453814-1-nao.horiguchi@xxxxxxxxx/
> but the patch was found incorrect because the related page table entries disappeared
> after split_huge_page() succeeded.  I thought I'm going to study more, but
> didn't make it this week because I looked at other review requests.
>
> A pmd mapping for anonymous thp is replaced with 512 pte mappings by
> split_huge_page(), so I'm wondering why we don't do the same for shmem thp.

We have to install 512 ptes for anonymous pages otherwise they are
gone. But it is unnecessary for page cache pages, even though the ptes
are gone they are still in xarray and can be found by page fault
later.

IMHO we should be able to pass in one more parameter to teach
split_huge_page to reinstall ptes for page cache for some cases.

>
> >
> > I did some git archaeology and found this check was introduced in
> > 7f6bf39bbdd1 ("mm/hwpoison: fix panic due to split huge zero page") where
> > it wasn't intended to catch _file_ pages at all, but the zero page.
> > I suspect that nobody thought to look at this when introducing THP
> > for shmem.
>
> Yes, 7f6bf39bbdd1 was worked before thp page cache, so we did not consider
> it at that time.
>
> Thanks,
> Naoya Horiguchi
>
> >
> > > > 2. When the THP fails to split, use a spare page flag to indicate that
> > > > the THP contains a HWPoison bit in one of its subpages.  There are a
> > > > lot of PF_SECOND flags available for this purpose.
> > > >
> > > > but I know almost nothing about the memory-failure subsystem and I'm
> > > > still learning all the complexities of THPs, so it's entirely possible
> > > > I've overlooked something important.
> > >
> > > I wounder if it would be cleaner to switch PG_hwpoison to PF_HEAD: if
> > > split failed we posion whole compound page. Yes, we will waste more
> > > memory, but it makes it much cleaner for user: just check if the page is
> > > poisoned.
> >
> > I think that's a poor quality implementation ... it'd cause processes
> > to die that weren't even touching the page that had hwpoison.  Using
> > a PF_SECOND bit lets us do the check as cheaply as if we made hwpoison
> > PF_HEAD.





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux