Re: [PATCH v2 00/12] mm: userspace hugepage collapse

David Hildenbrand <david@xxxxxxxxxx> · Tue, 19 Apr 2022 22:02:53 +0200

>> E.g., with a very sparse memory layout, we don't want to waste
>> memory by allocating memory where we actually have no page populated yet
>> -- could be user space won't reuse that memory in the foreseeable
>> future. With too many swap entries, we don't want to trigger an
>> eventually unnecessary overhead of swapping in entries if user space
>> won't access them in the foreseeable future. Something similar applies
>> to max_ptes_shared, where one might just end up wasting a lot of memory
>> eventually in some applications.
>>
>> So IMHO, with MADV_COLLAPSE we should ignore/disable any heuristics that
>> try figuring out what user space might be doing. We know exactly what
>> user space asks for -- and that can be documented properly.
>>

Just a thought: if we ever want to implement khugepaged in user space,
it could theoretically obtain similar information via, e.g., the
pagemap. It wouldn't be race-free, but the question is whether that
would matter.
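
To make the pagemap idea a bit more concrete, here is a rough sketch of
how a user-space scanner might count present, swapped and unpopulated
pages in a candidate range, i.e. roughly the inputs behind max_ptes_none
and max_ptes_swap. It assumes the /proc/self/pagemap bit layout from
Documentation/admin-guide/mm/pagemap.rst (bit 63 = present, bit 62 =
swapped); struct range_stats, pagemap_account() and pagemap_scan() are
made-up names for illustration only. As said, the result is only a
snapshot and can be stale by the time anybody acts on it.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bits of a 64-bit pagemap entry, per the kernel's pagemap documentation. */
#define PM_PRESENT (1ULL << 63)
#define PM_SWAPPED (1ULL << 62)

/* Hypothetical helper type: per-range counters. */
struct range_stats {
	size_t present;	/* pages currently in RAM */
	size_t swapped;	/* pages that are swapped out */
	size_t none;	/* pages that are not populated */
};

/* Classify one pagemap entry and bump the matching counter. */
static void pagemap_account(uint64_t entry, struct range_stats *st)
{
	if (entry & PM_PRESENT)
		st->present++;
	else if (entry & PM_SWAPPED)
		st->swapped++;
	else
		st->none++;
}

/*
 * Scan len bytes of our own address space starting at the page that
 * contains start. Returns 0 on success, -1 on error. The counters are a
 * racy snapshot: pages may be faulted in or reclaimed concurrently.
 */
static int pagemap_scan(const void *start, size_t len, struct range_stats *st)
{
	long page_size = sysconf(_SC_PAGESIZE);
	uint64_t entry;
	size_t i, npages;
	uintptr_t vpn;
	int fd;

	if (page_size <= 0)
		return -1;
	npages = len / (size_t)page_size;
	vpn = (uintptr_t)start / (uintptr_t)page_size;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	st->present = st->swapped = st->none = 0;
	for (i = 0; i < npages; i++) {
		/* One 8-byte entry per virtual page, indexed by page number. */
		off_t off = (off_t)((vpn + i) * sizeof(entry));

		if (pread(fd, &entry, sizeof(entry), off) != (ssize_t)sizeof(entry)) {
			close(fd);
			return -1;
		}
		pagemap_account(entry, st);
	}
	close(fd);
	return 0;
}
```

A user-space policy could then compare st.none and st.swapped against
its own thresholds before deciding whether to ask for a collapse of
that range.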

I consider the primary use case to be giving an application more precise
control over actual THP placement.
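
From the application side, that could look roughly like the sketch
below. It assumes the advice ends up exported as MADV_COLLAPSE via
<sys/mman.h>, as this series proposes; collapse_range() is a made-up
wrapper name, and on headers that lack the define it simply fails with
ENOSYS. Which khugepaged tunables (if any) the call honors is exactly
what is being discussed here, so the sketch assumes nothing about that.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical wrapper: ask the kernel to collapse [addr, addr + len)
 * into THPs right now. Returns 0 on success, -1 with errno set otherwise.
 */
static int collapse_range(void *addr, size_t len)
{
#ifdef MADV_COLLAPSE
	/* Assumption: the advice is available under this name. */
	return madvise(addr, len, MADV_COLLAPSE);
#else
	/* Headers predate the proposed advice; report it as unsupported. */
	(void)addr;
	(void)len;
	errno = ENOSYS;
	return -1;
#endif
}
```

The application would call this on exactly the ranges it cares about,
e.g. right after populating a hot, hugepage-aligned region, and treat a
failure as "keep running on small pages".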

> 
> Sounds good to me. Would you also be in favor of decoupling allocation
> semantics from khugepaged? I.e. we'll pick some default gfp flags and
> not depend on /sys/kernel/mm/transparent_hugepage/khugepaged/defrag?

Good question. It's not really a heuristic like that other stuff.

Easy answer: we're not dealing with khugepaged, so anything in
/sys/kernel/mm/transparent_hugepage/khugepaged/ shouldn't apply?

Sure, we could have separate toggles for MADV_COLLAPSE.

Maybe we simply want a dedicated syscall where we can specify additional
options ... but maybe that simply over-complicates the problem.

-- 
Thanks,

David / dhildenb