Re: [PATCH v4 01/10] mm: vmscan: add validation before splitting shmem large folio

On Thu, Aug 08, 2024 at 05:34:50PM GMT, Baolin Wang wrote:
> 
> 
> On 2024/8/8 16:51, David Hildenbrand wrote:
> > On 08.08.24 04:36, Baolin Wang wrote:
> > > 
> > > 
> > > On 2024/8/7 23:53, David Hildenbrand wrote:
> > > > On 07.08.24 09:31, Baolin Wang wrote:
> > > > > Page reclaim will not scan the anon LRU if there is no swap space,
> > > > > however MADV_PAGEOUT can still split shmem large folios even without
> > > > > a swap device. Thus add a check for available swap space before
> > > > > splitting a shmem large folio to avoid a redundant split.
> > > > > 
> > > > > Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
> > > > > ---
> > > > >    mm/vmscan.c | 8 ++++++++
> > > > >    1 file changed, 8 insertions(+)
> > > > > 
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index 31d13462571e..796f65781f4f 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> > > > >                }
> > > > >            } else if (folio_test_swapbacked(folio) &&
> > > > >                   folio_test_large(folio)) {
> > > > > +
> > > > > +            /*
> > > > > +             * Do not split shmem folio if no swap memory
> > > > > +             * available.
> > > > > +             */
> > > > > +            if (!total_swap_pages)
> > > > > +                goto activate_locked;
> > > > > +
> > > > >                /* Split shmem folio */
> > > > >                if (split_folio_to_list(folio, folio_list))
> > > > >                    goto keep_locked;
> > > > 
> > > > Reminds me of
> > > > 
> > > > commit 9a976f0c847b67d22ed694556a3626ed92da0422
> > > > Author: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > > > Date:   Thu Mar 9 15:05:43 2023 -0800
> > > > 
> > > >       shmem: skip page split if we're not reclaiming
> > > >       In theory when info->flags & VM_LOCKED we should not be getting
> > > >       shem_writepage() called so we should be verifying this with a
> > > >       WARN_ON_ONCE().  Since we should not be swapping then best to ensure we
> > > >       also don't do the folio split earlier too.  So just move the check early
> > > >       to avoid folio splits in case its a dubious call.
> > > >       We also have a similar early bail when !total_swap_pages so just move that
> > > >       earlier to avoid the possible folio split in the same situation.
> > > > 
> > > > 
> > > > But indeed, pageout() -> writepage() is called *after* the split in the
> > > > vmscan path.
> > > > 
> > > > In that "noswap" context, I wonder if we also want to skip folios that are
> > > > part of shmem with swapping disabled?
> > > 
> > > Yes, I think so.
> > > 
> > > > 
> > > > But now I am wondering under which circumstances we end up calling
> > > > shmem_writepage() with a large folio. And I think the answer is in the
> > > > comment above the folio_test_large() check: via
> > > > drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
> > > > 
> > > > 
> > > > ... so if shmem_writepage() handles+checks that, could we do
> > > > 
> > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > index a332cb80e928..7dfa3d6e8ba7 100644
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> > > >                                                  goto activate_locked_split;
> > > >                                   }
> > > >                           }
> > > > -               } else if (folio_test_swapbacked(folio) &&
> > > > -                          folio_test_large(folio)) {
> > > > -                       /* Split shmem folio */
> > > > -                       if (split_folio_to_list(folio, folio_list))
> > > > -                               goto keep_locked;
> > > >                   }
> > > > 
> > > >                   /*
> > > > 
> > > > instead?
> > > 
> > > Seems reasonable to me. But we should pass the 'folio_list' to
> > > shmem_writepage() to list the subpages of the large folio. Let me try.
> > 
> > Ah, yes, good point. Alternatively, we'd have to split and try writing
> > all subpages in there. I wonder what to do if we fail to write some, and
> > if we could handle that transparently, without the folio_list.
> 
> After some investigation, I prefer to pass 'folio_list' to shmem_writepage()
> via 'struct writeback_control', which could simplify the logic a lot.
> Otherwise, we would need to handle each subpage's writeback/reclaim/dirty
> state, as well as track each subpage's write result, which seems more
> complicated.
> 
> I made a quick change by passing 'folio_list', and it looks simple and works
> as expected.
> 
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index 75196b0f894f..10100e22d5c6 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -80,6 +80,9 @@ struct writeback_control {
>          */
>         struct swap_iocb **swap_plug;
> 
> +       /* Target list for splitting a large folio */
> +       struct list_head *list;
> +
>         /* internal fields used by the ->writepages implementation: */
>         struct folio_batch fbatch;
>         pgoff_t index;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 05525e9e7423..0a5a68f7d0a0 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1496,9 +1496,10 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
>          * and its shmem_writeback() needs them to be split when swapping.
>          */
>         if (wbc->split_large_folio && folio_test_large(folio)) {
> +try_split:
>                 /* Ensure the subpages are still dirty */
>                 folio_test_set_dirty(folio);
> -               if (split_huge_page(page) < 0)
> +               if (split_huge_page_to_list_to_order(page, wbc->list, 0))

We check for split_large_folio, but we still pass wbc->list in the i915
writepage() case. Previously, we were passing a NULL list (via
split_huge_page()). Shouldn't we address that case too?
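
For illustration only, a minimal sketch of one way shmem_writepage() could make
that explicit (this builds on the quick change quoted above and is an
assumption, not the actual patch; only the reclaim path is expected to set
wbc->list):

		/* Ensure the subpages are still dirty */
		folio_test_set_dirty(folio);
		if (wbc->list) {
			/* Reclaim path: collect the split subpages on the caller's list. */
			if (split_huge_page_to_list_to_order(page, wbc->list, 0))
				goto redirty;
		} else {
			/* i915 shmem_writeback(): keep the old split_huge_page() (NULL list) behaviour. */
			if (split_huge_page(page))
				goto redirty;
		}

Whether the i915 path needs anything beyond falling back to the old NULL-list
split is exactly the question above.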

>                         goto redirty;
>                 folio = page_folio(page);
>                 folio_clear_dirty(folio);
> @@ -1540,8 +1541,12 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
>         }
> 
>         swap = folio_alloc_swap(folio);
> -       if (!swap.val)
> +       if (!swap.val) {
> +               if (nr_pages > 1)
> +                       goto try_split;
> +
>                 goto redirty;
> +       }
> 
>         /*
>          * Add inode to shmem_unuse()'s list of swapped-out inodes,
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 277571815789..cf982cf2454f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -628,7 +628,7 @@ typedef enum {
>   * Calls ->writepage().
>   */
>  static pageout_t pageout(struct folio *folio, struct address_space *mapping,
> -                        struct swap_iocb **plug)
> +                        struct swap_iocb **plug, struct list_head *folio_list)
>  {
>         /*
>          * If the folio is dirty, only perform writeback if that write
> @@ -676,6 +676,11 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
>                         .swap_plug = plug,
>                 };
> 
> +               if (shmem_mapping(mapping)) {
> +                       wbc.list = folio_list;
> +                       wbc.split_large_folio = !IS_ENABLED(CONFIG_THP_SWAP);
> +               }
> +
>                 folio_set_reclaim(folio);
>                 res = mapping->a_ops->writepage(&folio->page, &wbc);
>                 if (res < 0)
> @@ -1259,23 +1264,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>                                                 goto activate_locked_split;
>                                 }
>                         }
> -               } else if (folio_test_swapbacked(folio) &&
> -                          folio_test_large(folio)) {
> -
> -                       /*
> -                        * Do not split shmem folio if no swap memory
> -                        * available.
> -                        */
> -                       if (!total_swap_pages)
> -                               goto activate_locked;
> -
> -                       /*
> -                        * Only split shmem folio when CONFIG_THP_SWAP
> -                        * is not enabled.
> -                        */
> -                       if (!IS_ENABLED(CONFIG_THP_SWAP) &&
> -                           split_folio_to_list(folio, folio_list))
> -                               goto keep_locked;
>                 }
> 
>                 /*
> @@ -1377,18 +1365,21 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>                          * starts and then write it out here.
>                          */
>                         try_to_unmap_flush_dirty();
> -try_pageout:
> -                       switch (pageout(folio, mapping, &plug)) {
> +                       switch (pageout(folio, mapping, &plug, folio_list)) {
>                         case PAGE_KEEP:
>                                 goto keep_locked;
>                         case PAGE_ACTIVATE:
> -                               if (shmem_mapping(mapping) && folio_test_large(folio) &&
> -                                   !split_folio_to_list(folio, folio_list)) {
> +                               /* Shmem can be split when writeback to swap */
> +                               if ((nr_pages > 1) && !folio_test_large(folio)) {
> +                                       sc->nr_scanned -= (nr_pages - 1);
>                                         nr_pages = 1;
> -                                       goto try_pageout;
>                                 }
>                                 goto activate_locked;
>                         case PAGE_SUCCESS:
> +                               if ((nr_pages > 1) && !folio_test_large(folio)) {
> +                                       sc->nr_scanned -= (nr_pages - 1);
> +                                       nr_pages = 1;
> +                               }
>                                 stat->nr_pageout += nr_pages;
> 
>                                 if (folio_test_writeback(folio)) {



