Re: [PATCH 2/4] mm: khugepaged: check if file page is on LRU after locking page

On Wed, Sep 15, 2021 at 4:00 PM Yang Shi <shy828301@xxxxxxxxx> wrote:
>
> On Wed, Sep 15, 2021 at 10:48 AM Yang Shi <shy828301@xxxxxxxxx> wrote:
> >
> > On Wed, Sep 15, 2021 at 4:49 AM Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, Sep 14, 2021 at 11:37:16AM -0700, Yang Shi wrote:
> > > > khugepaged does check whether the page is on the LRU, but it does so
> > > > without holding the page lock, and it doesn't check again after taking
> > > > the page lock.  So it may race with others, e.g. reclaimer, migration, etc.
> > > > All of them isolate the page from the LRU, then lock the page, then do something.
> > > >
> > > > But the page could still pass the refcount check done by khugepaged and
> > > > proceed to collapse.  Typically such a race is not fatal.  But if the page
> > > > was isolated from the LRU before khugepaged got to it, it likely means the
> > > > page is not suitable for collapse for now.
> > > >
> > > > The other, more serious case is that the following patch will keep the
> > > > poisoned page in the page cache for shmem, so khugepaged may collapse a
> > > > poisoned page since the refcount check could pass.  The 3 refcounts come from:
> > > >   - hwpoison
> > > >   - page cache
> > > >   - khugepaged
> > > >
> > > > Since the page is not on the LRU, no refcount is taken by LRU isolation.
> > > >
> > > > This is definitely not expected.  Checking if it is on LRU or not after
> > > > holding page lock could help serialize against hwpoison handler.
> > > >
> > > > But there is still a small race window between setting the hwpoison flag
> > > > and bumping the refcount in the hwpoison handler.  It could be closed by
> > > > checking the hwpoison flag in khugepaged, however this race seems unlikely
> > > > to happen in real-life workloads.  So just check the LRU flag for now to
> > > > avoid over-engineering.
> > > >
> > > > Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
> > > > ---
> > > >  mm/khugepaged.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > > index 045cc579f724..bdc161dc27dc 100644
> > > > --- a/mm/khugepaged.c
> > > > +++ b/mm/khugepaged.c
> > > > @@ -1808,6 +1808,12 @@ static void collapse_file(struct mm_struct *mm,
> > > >                       goto out_unlock;
> > > >               }
> > > >
> > > > +             /* The hwpoisoned page is off LRU but in page cache */
> > > > +             if (!PageLRU(page)) {
> > > > +                     result = SCAN_PAGE_LRU;
> > > > +                     goto out_unlock;
> > > > +             }
> > > > +
> > > >               if (isolate_lru_page(page)) {
> > >
> > > isolate_lru_page() should catch the case, no? TestClearPageLRU would fail
> > > and we get here.
> >
> > Hmm... you are definitely right. How could I miss this point?
> >
> > It might be because I messed up the page state with some tests which
> > may do a hole punch and then reread the same index. That could drop the
> > poisoned page, after which the collapse succeeds. But I'm not sure. Anyway,
> > I didn't figure out how the poisoned page could be collapsed. It seems
> > impossible. I will drop this patch.
>
> I think I figured out the problem. This problem happens after the
> page cache split patch, and only if the hwpoisoned page is not the head
> page. It is because THP split will unfreeze the refcount of tail pages to 2
> (restoring the refcount from page cache) and then decrement it to 1. The
> refcount pin from hwpoison is gone while the page is still on the LRU. If
> khugepaged then locks the page before hwpoison does, the refcount is
> exactly what khugepaged expects.
>
> What is worse, this problem seems to apply to anonymous pages too.
> Once the anonymous THP is split by hwpoison, the pin from hwpoison is
> gone too and the refcount is 1 (which comes from the PTE map). Then
> khugepaged could collapse it into a huge page again. It may incur data
> corruption.
>
> And the poisoned page may be freed back to buddy because of the lost refcount pin.
>
> If the poisoned page is the head page, the code is fine since hwpoison
> doesn't put the refcount for the head page after split.
>
> The fix is simple, just keep the refcount pin for hwpoisoned subpage.

Err... wait... I just realized I missed the below code block:

if (subpage == page)
        continue;

It skips the subpage passed to split_huge_page() so the refcount pin
from the caller for this subpage is kept. And hwpoison doesn't put it.
So it seems fine.
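
To make the refcount arithmetic concrete, here is a rough userspace sketch
of that loop. It is a toy model, not kernel code: struct toy_page and
toy_split() are hypothetical names, and the starting refcount of 2 per
subpage is simply the number quoted earlier in this thread, not something
derived from the real split path.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the refcount is modelled. */
struct toy_page {
	int refcount;
};

/*
 * Toy model of the tail end of a THP split (hypothetical helper).
 *
 * Assumption, taken from the numbers in this thread: every subpage is
 * unfrozen to a refcount of 2 (page cache reference plus one extra pin).
 * The split then drops the extra pin on every subpage except the one the
 * caller passed in, so the caller's pin on that subpage survives.
 */
static void toy_split(struct toy_page *head, size_t nr, struct toy_page *page)
{
	size_t i;

	/* "Unfreeze": restore the references on each subpage. */
	for (i = 0; i < nr; i++)
		head[i].refcount = 2;

	/* Drop the extra pin, skipping the subpage the caller asked about. */
	for (i = 0; i < nr; i++) {
		struct toy_page *subpage = head + i;

		if (subpage == page)
			continue;
		subpage->refcount--;
	}
}
```

With four subpages and toy_split(pages, 4, &pages[2]), pages[2] keeps a
refcount of 2 while the other three end up at 1, which is the difference
between the poisoned subpage and its siblings described above.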

>
> >
> > >
> > > >                       result = SCAN_DEL_PAGE_LRU;
> > > >                       goto out_unlock;
> > > > --
> > > > 2.26.2
> > > >
> > > >
> > >
> > > --
> > >  Kirill A. Shutemov


