Re: [RFC PATCH 07/14] mm/khugepaged: add vm_flags_ignore to hugepage_vma_revalidate_pmd_count()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Tue, Mar 8, 2022 at 1:35 PM Zach O'Keefe <zokeefe@xxxxxxxxxx> wrote:
> >
> > In madvise collapse context, we optionally want to be able to ignore
> > advice from MADV_NOHUGEPAGE-marked regions.
>
> Could you please elaborate why this usecase is valid? Typically
> MADV_NOHUGEPAGE is set when the users really don't want to have THP
> for this area. So it doesn't make too much sense to ignore it IMHO.
>

Hey Yang, thanks for taking time to review and comment.

Semantically, the way I see it, is that MADV_NOHUGEPAGE is a way for
the user to say "I don't want hugepages here", so that the kernel
knows not to do so when faulting memory, and khugepaged can stay away.
However, in MADV_COLLAPSE, the user is explicitly requesting this be
backed by hugepages - so presumably that is exactly what they want.

IOW, if the user didn't want this memory to be backed by hugepages,
they wouldn't be MADV_COLLAPSE'ing it. If there was a range of memory
the user wanted collapsed, but that had some sub-areas marked
MADV_NOHUGEPAGE, they could always issue multiple MADV_COLLAPSE
operations around the excluded regions.

In terms of use cases, I don't have a concrete example, but a user
could hypothetically choose to exclude regions from management from
khugepaged, but still be able to collapse the memory themselves,
when/if they deem appropriate.

> >
> > Add a vm_flags_ignore argument to hugepage_vma_revalidate_pmd_count()
> > which can be used to ignore vm flags used when considering thp
> > eligibility.
> >
> > Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> > ---
> >  mm/khugepaged.c | 18 ++++++++++++------
> >  1 file changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 1d20be47bcea..ecbd3fc41c80 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -964,10 +964,14 @@ khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
> >  #endif
> >
> >  /*
> > - * Revalidate a vma's eligibility to collapse nr hugepages.
> > + * Revalidate a vma's eligibility to collapse nr hugepages. vm_flags_ignore
> > + * can be used to ignore certain vma_flags that would otherwise be checked -
> > + * the principal example being VM_NOHUGEPAGE which is ignored in madvise
> > + * collapse context.
> >   */
> >  static int hugepage_vma_revalidate_pmd_count(struct mm_struct *mm,
> >                                              unsigned long address, int nr,
> > +                                            unsigned long vm_flags_ignore,
> >                                              struct vm_area_struct **vmap)
> >  {
> >         struct vm_area_struct *vma;
> > @@ -986,7 +990,7 @@ static int hugepage_vma_revalidate_pmd_count(struct mm_struct *mm,
> >         hend = vma->vm_end & HPAGE_PMD_MASK;
> >         if (address < hstart || (address + nr * HPAGE_PMD_SIZE) > hend)
> >                 return SCAN_ADDRESS_RANGE;
> > -       if (!hugepage_vma_check(vma, vma->vm_flags))
> > +       if (!hugepage_vma_check(vma, vma->vm_flags & ~vm_flags_ignore))
> >                 return SCAN_VMA_CHECK;
> >         /* Anon VMA expected */
> >         if (!vma->anon_vma || vma->vm_ops)
> > @@ -1000,9 +1004,11 @@ static int hugepage_vma_revalidate_pmd_count(struct mm_struct *mm,
> >   */
> >
> >  static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
> > +                                  unsigned long vm_flags_ignore,
> >                                    struct vm_area_struct **vmap)
> >  {
> > -       return hugepage_vma_revalidate_pmd_count(mm, address, 1, vmap);
> > +       return hugepage_vma_revalidate_pmd_count(mm, address, 1,
> > +                       vm_flags_ignore, vmap);
> >  }
> >
> >  /*
> > @@ -1043,7 +1049,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
> >                 /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
> >                 if (ret & VM_FAULT_RETRY) {
> >                         mmap_read_lock(mm);
> > -                       if (hugepage_vma_revalidate(mm, haddr, &vma)) {
> > +                       if (hugepage_vma_revalidate(mm, haddr, VM_NONE, &vma)) {
> >                                 /* vma is no longer available, don't continue to swapin */
> >                                 trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> >                                 return false;
> > @@ -1200,7 +1206,7 @@ static void collapse_huge_page(struct mm_struct *mm,
> >         count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);
> >
> >         mmap_read_lock(mm);
> > -       result = hugepage_vma_revalidate(mm, address, &vma);
> > +       result = hugepage_vma_revalidate(mm, address, VM_NONE, &vma);
> >         if (result) {
> >                 mmap_read_unlock(mm);
> >                 goto out_nolock;
> > @@ -1232,7 +1238,7 @@ static void collapse_huge_page(struct mm_struct *mm,
> >          */
> >         mmap_write_lock(mm);
> >
> > -       result = hugepage_vma_revalidate(mm, address, &vma);
> > +       result = hugepage_vma_revalidate(mm, address, VM_NONE, &vma);
> >         if (result)
> >                 goto out_up_write;
> >         /* check if the pmd is still valid */
> > --
> > 2.35.1.616.g0bdcbb4464-goog
> >




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux