Re: [PATCH v2 00/12] mm: userspace hugepage collapse

Hey Peter,

Thanks for taking the time to review!

On Thu, Apr 14, 2022 at 5:04 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> Hi, Zach,
>
> On Thu, Apr 14, 2022 at 11:06:00AM -0700, Zach O'Keefe wrote:
> > process_madvise(2)
> >
> >       Performs a synchronous collapse of the native pages
> >       mapped by the list of iovecs into transparent hugepages.
> >
> >       Allocation semantics are the same as khugepaged, and depend on
> >       (1) the active sysfs settings
> >       /sys/kernel/mm/transparent_hugepage/enabled and
> >       /sys/kernel/mm/transparent_hugepage/khugepaged/defrag, and (2)
> >       the VMA flags of the memory range being collapsed.
> >
> >       Collapse eligibility criteria differ from khugepaged in that
> >       the sysfs files
> >       /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_[none|swap|shared]
> >       are ignored.
>
> The userspace khugepaged idea definitely makes sense to me, though I'm
> curious how the line is drawn on the different behaviors here by explicitly
> ignoring the max_ptes_* entries.
>
> Let's assume the initiative is to duplicate a more data-aware khugepaged in
> the userspace, then IMHO it makes more sense to start with all the policies
> that apply to khugepaged already, including max_ptes_*.
>
> I can understand the willingness to provide even stronger semantics here
> than khugepaged since the userspace could have very clear knowledge of how
> to provision the memories (better than a kernel scanner).  It's just that
> IMHO it could be slightly confusing if the new interface only partially
> applies the khugepaged rules.
>
> No strong opinion here.  It could already have been a trade-off after the
> discussion from the RFC with Michal which I read.  Just curious about how
> you made that design decision so feel free to read it as a pure question.
>

I understand your point here. The allocation and max_ptes_* semantics are
split between khugepaged-like and fault-like, respectively, which
could be confusing. Originally, I proposed a MADV_F_COLLAPSE_LIMITS
flag to control the former's behavior, but agreed to keep things
simple to start, and expand the interface if/when necessary. I opted
to ignore max_ptes_* as the default since I envisioned that early
adopters would "just want it to work". One such example would be
backing executable text by hugepages on program load when many pages
haven't been demand-paged in yet.
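
To make the text-backing example concrete, here is a rough, untested
sketch of how a loader might request a collapse of a range in its own
address space through process_madvise(2). Several things in it are
assumptions rather than part of this series: the MADV_COLLAPSE value
(25 is what mainline headers later used and may not match this
posting), the fallback syscall numbers, a 2 MiB PMD hugepage size, and
the helper names hpage_span() and collapse_range(), which are made up
for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25          /* assumption: value from later mainline headers */
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434        /* assumption: common syscall number */
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440   /* assumption: common syscall number */
#endif

#define HPAGE_SIZE (2UL << 20)    /* assumption: 2 MiB PMD-sized hugepages */

/*
 * Hypothetical helper: shrink [start, start + len) inward to hugepage
 * boundaries. Returns the aligned length, or 0 if no whole hugepage fits.
 */
static size_t hpage_span(uintptr_t start, size_t len, uintptr_t *aligned)
{
	uintptr_t lo = (start + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
	uintptr_t hi = (start + len) & ~(HPAGE_SIZE - 1);

	assert(aligned != NULL);
	if (hi <= lo)
		return 0;
	*aligned = lo;
	return hi - lo;
}

/*
 * Hypothetical helper: ask the kernel to collapse the hugepage-aligned
 * part of [addr, addr + len) in the calling process. Returns the raw
 * syscall result, or -1 with errno set on failure.
 */
static long collapse_range(void *addr, size_t len)
{
	uintptr_t lo;
	size_t span = hpage_span((uintptr_t)addr, len, &lo);
	struct iovec iov;
	long ret;
	int pidfd, saved;

	if (span == 0) {
		errno = EINVAL;
		return -1;
	}
	pidfd = (int)syscall(SYS_pidfd_open, getpid(), 0);
	if (pidfd < 0)
		return -1;

	iov.iov_base = (void *)lo;
	iov.iov_len = span;
	ret = syscall(SYS_process_madvise, pidfd, &iov, 1UL, MADV_COLLAPSE, 0U);

	saved = errno;            /* keep the syscall's errno across close() */
	close(pidfd);
	errno = saved;
	return ret;
}
```

A loader would call collapse_range() on the mapped text segment right
after mmap(), and treat failure as non-fatal, since the program still
runs correctly on native pages.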

What do you think?

Thanks,
Zach

> Thanks,
>
> --
> Peter Xu
>



