Re: [PATCH v5 10/13] mm/madvise: add MADV_COLLAPSE to process_madvise()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hey Rongwei,

Thanks for taking the time to review!

On Tue, May 10, 2022 at 5:49 PM Rongwei Wang
<rongwei.wang@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi, Zach
>
> Thanks for your great patchset!
> Recently, We also try to collapse THP in this way, likes performance
> degradation due to using too much hugepages in our scenes.
>
> And there is a doubt about process_madvise(MADV_COLLAPSE) when we test
> this patchset:. It seems that process_madvise(MADV_COLLAPSE) rely on
> madvise(MADV_HUGEPAGE)? If the vma wasn't marked with 'hg',
> process_madvise(MADV_COLLAPSE) will fail to collapse. And if I miss
> something, please let me know.
>

I tried to have MADV_COLLAPSE follow the same THP eligibility
semantics as khugepaged and at-fault: either THP=always, or
THP=madvise and the vma is marked with MADV_HUGEPAGE, as you point
out.

If I understand you correctly, the usefulness of
process_madvise(MADV_COLLAPSE) is limited in the case where
THP=madvise and a CAP_SYS_ADMIN user is requesting a collapse of
behalf of another process since they don't have a way to mark the
target memory as eligible (which requires VM_HUGEPAGE).

If so, I think that's a valid point, and your suggestion below of a
supporting MADV_[NO]HUGEPAGE for process_madvise(2) makes sense. For
the sake of exploring all options, I'll mention that there was also a
previous idea suggested by Yang Shi where MADV_COLLAPSE could also set
VM_HUGEPAGE[1].

Since it's possible supporting MADV_[NO]HUGEPAGE for
process_madivse(2) has applications outside a subsequent
MADV_COLLAPSE, and since I don't see process_madvise(MADV_COLLAPSE) to
be in a hot path, I'd vote in favor of your suggestion and include
process_madvise(MADV_[NO]HUGEPAGE) support in v6 unless others object.

Thanks again for your review and your suggestion!
Zach

[1] https://lore.kernel.org/linux-mm/CAHbLzkqLRBd6u3qn=KqpOhRcPZtpGXbTXLUjK1z=4d_dQ06Pvw@xxxxxxxxxxxxxx/

> If so, how about introducing process_madvise(MADV_HUGEPAGE) or
> process_madvise(MADV_NOHUGEPAGE)? The former helps to mark the target
> vma with 'hg', and the collapse process can be finished completely with
> the help of other processes. the latter could let some special vma avoid
> collapsing when setting 'THP=always'.
>
> Best regards,
> -wrw
>
> On 5/5/22 5:44 AM, Zach O'Keefe wrote:
> > Allow MADV_COLLAPSE behavior for process_madvise(2) if caller has
> > CAP_SYS_ADMIN or is requesting collapse of it's own memory.
> >
> > Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> > ---
> >   mm/madvise.c | 6 ++++--
> >   1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 638517952bd2..08c11217025a 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -1168,13 +1168,15 @@ madvise_behavior_valid(int behavior)
> >   }
> >
> >   static bool
> > -process_madvise_behavior_valid(int behavior)
> > +process_madvise_behavior_valid(int behavior, struct task_struct *task)
> >   {
> >       switch (behavior) {
> >       case MADV_COLD:
> >       case MADV_PAGEOUT:
> >       case MADV_WILLNEED:
> >               return true;
> > +     case MADV_COLLAPSE:
> > +             return task == current || capable(CAP_SYS_ADMIN);
> >       default:
> >               return false;
> >       }
> > @@ -1452,7 +1454,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> >               goto free_iov;
> >       }
> >
> > -     if (!process_madvise_behavior_valid(behavior)) {
> > +     if (!process_madvise_behavior_valid(behavior, task)) {
> >               ret = -EINVAL;
> >               goto release_task;
> >       }




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux