Re: [PATCH v10 3/3] mm/khugepaged: recover from poisoned file-backed memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 24 Mar 2023, Jiaqi Yan wrote:
> On Fri, Mar 24, 2023 at 2:15 PM Yang Shi <shy828301@xxxxxxxxx> wrote:
> > On Sat, Mar 4, 2023 at 10:51 PM Jiaqi Yan <jiaqiyan@xxxxxxxxxx> wrote:
> > >
> > > Make collapse_file roll back when copying pages failed. More concretely:
> > > - extract copying operations into a separate loop
> > > - postpone the updates for nr_none until both scanning and copying
> > >   succeeded
> > > - postpone joining small xarray entries until both scanning and copying
> > >   succeeded
> > > - postpone the update operations to NR_XXX_THPS until both scanning and
> > >   copying succeeded
> > > - for non-SHMEM file, roll back filemap_nr_thps_inc if scan succeeded but
> > >   copying failed
> > >
> > > Tested manually:
> > > 0. Enable khugepaged on system under test. Mount tmpfs at /mnt/ramdisk.
> > > 1. Start a two-thread application. Each thread allocates a chunk of
> > >    non-huge memory buffer from /mnt/ramdisk.
> > > 2. Pick 4 random buffer address (2 in each thread) and inject
> > >    uncorrectable memory errors at physical addresses.
> > > 3. Signal both threads to make their memory buffer collapsible, i.e.
> > >    calling madvise(MADV_HUGEPAGE).
> > > 4. Wait and then check kernel log: khugepaged is able to recover from
> > >    poisoned pages by skipping them.
> > > 5. Signal both threads to inspect their buffer contents and make sure no
> > >    data corruption.
> > >
> > > Signed-off-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
> >
> > Reviewed-by: Yang Shi <shy828301@xxxxxxxxx>
> >
> > Just a nit below:

Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>

with a little nit from me below, if you are respinning:

> >
> > > ---
> > >  mm/khugepaged.c | 78 ++++++++++++++++++++++++++++++-------------------
> > >  1 file changed, 48 insertions(+), 30 deletions(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index c3c217f6ebc6e..3ea2aa55c2c52 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -1890,6 +1890,9 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
> > >  {
> > >         struct address_space *mapping = file->f_mapping;
> > >         struct page *hpage;
> > > +       struct page *page;
> > > +       struct page *tmp;
> > > +       struct folio *folio;
> > >         pgoff_t index = 0, end = start + HPAGE_PMD_NR;
> > >         LIST_HEAD(pagelist);
> > >         XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
> > > @@ -1934,8 +1937,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
> > >
> > >         xas_set(&xas, start);
> > >         for (index = start; index < end; index++) {
> > > -               struct page *page = xas_next(&xas);
> > > -               struct folio *folio;
> > > +               page = xas_next(&xas);
> > >
> > >                 VM_BUG_ON(index != xas.xa_index);
> > >                 if (is_shmem) {
> > > @@ -2117,10 +2119,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
> > >         }
> > >         nr = thp_nr_pages(hpage);
> > >
> > > -       if (is_shmem)
> > > -               __mod_lruvec_page_state(hpage, NR_SHMEM_THPS, nr);
> > > -       else {
> > > -               __mod_lruvec_page_state(hpage, NR_FILE_THPS, nr);
> > > +       if (!is_shmem) {
> > >                 filemap_nr_thps_inc(mapping);
> > >                 /*
> > >                  * Paired with smp_mb() in do_dentry_open() to ensure

That "nr = thp_nr_pages(hpage);" above becomes stranded a long way away
from where "nr" is actually used for updating those statistics: please
move it down with them.  (I see "nr" is also reported in the tracepoint
at the end, FWIW, so maybe that will show "0" in more failure cases than
it used to, but that's okay - it has been decently initialized.)

Thanks,
Hugh

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux