Re: [PATCH] mm, hugetlb: Avoid double clearing for hugetlb pages

David Hildenbrand <david@xxxxxxxxxx> · Tue, 20 Oct 2020 15:36:18 +0200

On 20.10.20 10:20, Michal Hocko wrote:
> On Mon 19-10-20 15:28:53, Guilherme G. Piccoli wrote:
> [...]
>> $ time echo 32768 > /proc/sys/vm/nr_hugepages
>> real    0m24.189s
>> user    0m0.000s
>> sys     0m24.184s
>>
>> $ cat /proc/meminfo |grep "MemA\|Hugetlb"
>> MemAvailable:   30784732 kB
>> Hugetlb:        67108864 kB
>>
>> * Without this patch, init_on_alloc=0
>> $ cat /proc/meminfo |grep "MemA\|Hugetlb"
>> MemAvailable:   97892752 kB
>> Hugetlb:               0 kB
>>
>> $ time echo 32768 > /proc/sys/vm/nr_hugepages
>> real    0m0.316s
>> user    0m0.000s
>> sys     0m0.316s
> 
> Yes zeroying is quite costly and that is to be expected when the feature
> is enabled. Hugetlb like other allocator users perform their own
> initialization rather than go through __GFP_ZERO path. More on that
> below.
> 
> Could you be more specific about why this is a problem. Hugetlb pool is
> usualy preallocatd once during early boot. 24s for 65GB of 2MB pages
> is non trivial amount of time but it doens't look like a major disaster
> either. If the pool is allocated later it can take much more time due to
> memory fragmentation.
> 
> I definitely do not want to downplay this but I would like to hear about
> the real life examples of the problem.
> 
> [...]
>>
>> Hi everybody, thanks in advance for the review/comments. I'd like to
>> point 2 things related to the implementation:
>>
>> 1) I understand that adding GFP flags is not really welcome by the
>> mm community; I've considered passing that as function parameter but
>> that would be a hacky mess, so I decided to add the flag since it seems
>> this is a fair use of the flag mechanism (to control actions on pages).
>> If anybody has a better/simpler suggestion to implement this, I'm all
>> ears - thanks!
> 
> This has been discussed already (http://lkml.kernel.org/r/20190514143537.10435-4-glider@xxxxxxxxxx.
> Previously it has been brought up in SLUB context AFAIR. Your numbers
> are quite clear here but do we really need a gfp flag with all the
> problems we tend to grow in with them?
> 
> One potential way around this specifically for hugetlb would be to use
> __GFP_ZERO when allocating from the allocator and marking the fact in
> the struct page while it is sitting in the pool. Page fault handler
> could then skip the zeroying phase. Not an act of beauty TBH but it
> fits into the existing model of the full control over initialization.
> Btw. it would allow to implement init_on_free semantic as well. I
> haven't implemented the actual two main methods
> hugetlb_test_clear_pre_init_page and hugetlb_mark_pre_init_page because
> I am not entirely sure about the current state of hugetlb struct page in
> the pool. But there should be a lot of room in there (or in tail pages).
> Mike will certainly know much better. But the skeleton of the patch
> would look like something like this (not even compile tested).

Something like that is certainly nicer than proposed gfp flags.
(__GFP_NOINIT_ON_ALLOC is just ugly, especially, to optimize such
corner-case features)

-- 
Thanks,

David / dhildenb