Re: [PATCH v2 00/12] mm: userspace hugepage collapse

On Tue, Apr 19, 2022 at 1:03 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> >> E.g., with a very sparse memory layout, we don't want to waste
> >> memory by allocating memory where we actually have no page populated yet
> >> -- could be user space won't reuse that memory in the foreseeable
> >> future. With too many swap entries, we don't want to trigger an
> >> eventually unnecessary overhead of swapping in entries if user space
> >> won't access them in the foreseeable future. Something similar applies
> >> to max_ptes_shared, where one might just end up wasting a lot of memory
> >> eventually in some applications.
> >>
> >> So IMHO, with MADV_COLLAPSE we should ignore/disable any heuristics that
> >> try figuring out what user space might be doing. We know exactly what
> >> user space asks for -- and that can be documented properly.
> >>
>
> Just a thought, if we ever want to implement khugepaged in user space,
> it could theoretically obtain similar information using e.g., the
> pagemap. It wouldn't be race-free, but the question is if it would matter.
>
> I consider the primary use case to be giving an application more
> precise control over actual THP placement.
>

Good point about the pagemap, and I agree about the primary use case.
I'll make that clear in the v3 cover letter.
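
For illustration only, here is a rough sketch of how a userspace agent
might gather that kind of information from the pagemap. It assumes the
documented /proc/<pid>/pagemap layout (one 64-bit entry per virtual
page, bit 63 = present, bit 62 = swapped). The names scan_range() and
struct range_stats are hypothetical and not part of this series, and,
as noted above, the result is not race-free.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions from the kernel's pagemap documentation. */
#define PM_PRESENT (1ULL << 63)
#define PM_SWAPPED (1ULL << 62)

/* Hypothetical helper type: per-range page counts. */
struct range_stats {
	long present;	/* pages resident in RAM */
	long swapped;	/* pages currently in swap */
	long none;	/* pages neither present nor swapped */
};

/*
 * Count present/swapped/unpopulated pages in [start, start + len) of the
 * calling process. The range should be page aligned. Returns 0 on
 * success, -1 on error. The counts are only a snapshot: the mapping can
 * change while it is being scanned.
 */
static int scan_range(const void *start, size_t len, struct range_stats *st)
{
	long psz = sysconf(_SC_PAGESIZE);
	uintptr_t va = (uintptr_t)start;
	int fd;

	assert(st != NULL);
	if (psz <= 0)
		return -1;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	st->present = st->swapped = st->none = 0;
	for (size_t off = 0; off < len; off += (size_t)psz) {
		uint64_t ent;
		/* One 8-byte entry per virtual page, indexed by page number. */
		off_t pos = (off_t)(((va + off) / (uintptr_t)psz) * sizeof(ent));

		if (pread(fd, &ent, sizeof(ent), pos) != (ssize_t)sizeof(ent)) {
			close(fd);
			return -1;
		}
		if (ent & PM_PRESENT)
			st->present++;
		else if (ent & PM_SWAPPED)
			st->swapped++;
		else
			st->none++;
	}
	close(fd);
	return 0;
}
```

A userspace policy could compare st.none and st.swapped for a
hugepage-aligned range against its own thresholds before requesting a
collapse, which is roughly what max_ptes_none and max_ptes_swap do for
khugepaged today.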

> >
> > Sounds good to me. Would you also be in favor of decoupling allocation
> > semantics from khugepaged? I.e. we'll pick some default gfp flags and
> > not depend on /sys/kernel/mm/transparent_hugepage/khugepaged/defrag?
>
> Good question. It's not really a heuristic like that other stuff.
>
> Easy answer: we're not dealing with khugepaged, so anything in
> /sys/kernel/mm/transparent_hugepage/khugepaged/ shouldn't apply?
>

That's what I'm thinking now too. If there are no objections, I'll
proceed in that direction for v3.

> Sure, we could have separate toggles for MADV_COLLAPSE.
>
> Maybe we simply want a dedicated syscall where we can specify additional
> options ... but maybe that simply over-complicates the problem.
>

Thankfully process_madvise(2) has flags, and madvise(2) users can
always migrate to using process_madvise(2) on self. Piggy-backing off
madvise infrastructure for these "non-advice actions" (e.g.
MADV_PAGEOUT) seems to be the norm.
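
To make the "process_madvise(2) on self" idea concrete, here is a
minimal sketch using raw syscalls. Assumptions: the fallback syscall
numbers (434 for pidfd_open, 440 for process_madvise) are the unified
numbers used on most architectures, the helper name
self_process_madvise() is made up for this example, and the advice
shown is the existing MADV_PAGEOUT, since MADV_COLLAPSE is only
proposed in this series. Today the flags argument must be 0, and
depending on the kernel version the call may fail with EPERM without
CAP_SYS_NICE, even when the target is the caller.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fallbacks for older headers; values assumed from the unified table. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Hypothetical helper: apply 'advice' to [addr, addr + len) of the
 * calling process through process_madvise(2) instead of madvise(2).
 * Returns the number of bytes advised, or -1 with errno set.
 */
static ssize_t self_process_madvise(void *addr, size_t len, int advice,
				    unsigned int flags)
{
	struct iovec iov = { .iov_base = addr, .iov_len = len };
	ssize_t ret;
	int saved_errno;
	int pidfd;

	assert(addr != NULL);

	/* A pidfd that refers to ourselves. */
	pidfd = (int)syscall(SYS_pidfd_open, getpid(), 0U);
	if (pidfd < 0)
		return -1;

	ret = syscall(SYS_process_madvise, pidfd, &iov, (size_t)1,
		      advice, flags);

	/* Preserve errno from process_madvise across close(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;
	return ret;
}
```

An existing madvise(addr, len, MADV_PAGEOUT) caller would become
self_process_madvise(addr, len, MADV_PAGEOUT, 0), and the flags
argument is where any future per-call options could go.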

Thanks as always for your time and thoughts!

Zach

> --
> Thanks,
>
> David / dhildenb
>



