On 20.10.20 10:20, Michal Hocko wrote: > On Mon 19-10-20 15:28:53, Guilherme G. Piccoli wrote: > [...] >> $ time echo 32768 > /proc/sys/vm/nr_hugepages >> real 0m24.189s >> user 0m0.000s >> sys 0m24.184s >> >> $ cat /proc/meminfo |grep "MemA\|Hugetlb" >> MemAvailable: 30784732 kB >> Hugetlb: 67108864 kB >> >> * Without this patch, init_on_alloc=0 >> $ cat /proc/meminfo |grep "MemA\|Hugetlb" >> MemAvailable: 97892752 kB >> Hugetlb: 0 kB >> >> $ time echo 32768 > /proc/sys/vm/nr_hugepages >> real 0m0.316s >> user 0m0.000s >> sys 0m0.316s > > Yes zeroying is quite costly and that is to be expected when the feature > is enabled. Hugetlb like other allocator users perform their own > initialization rather than go through __GFP_ZERO path. More on that > below. > > Could you be more specific about why this is a problem. Hugetlb pool is > usualy preallocatd once during early boot. 24s for 65GB of 2MB pages > is non trivial amount of time but it doens't look like a major disaster > either. If the pool is allocated later it can take much more time due to > memory fragmentation. > > I definitely do not want to downplay this but I would like to hear about > the real life examples of the problem. > > [...] >> >> Hi everybody, thanks in advance for the review/comments. I'd like to >> point 2 things related to the implementation: >> >> 1) I understand that adding GFP flags is not really welcome by the >> mm community; I've considered passing that as function parameter but >> that would be a hacky mess, so I decided to add the flag since it seems >> this is a fair use of the flag mechanism (to control actions on pages). >> If anybody has a better/simpler suggestion to implement this, I'm all >> ears - thanks! > > This has been discussed already (http://lkml.kernel.org/r/20190514143537.10435-4-glider@xxxxxxxxxx. > Previously it has been brought up in SLUB context AFAIR. Your numbers > are quite clear here but do we really need a gfp flag with all the > problems we tend to grow in with them? > > One potential way around this specifically for hugetlb would be to use > __GFP_ZERO when allocating from the allocator and marking the fact in > the struct page while it is sitting in the pool. Page fault handler > could then skip the zeroying phase. Not an act of beauty TBH but it > fits into the existing model of the full control over initialization. > Btw. it would allow to implement init_on_free semantic as well. I > haven't implemented the actual two main methods > hugetlb_test_clear_pre_init_page and hugetlb_mark_pre_init_page because > I am not entirely sure about the current state of hugetlb struct page in > the pool. But there should be a lot of room in there (or in tail pages). > Mike will certainly know much better. But the skeleton of the patch > would look like something like this (not even compile tested). Something like that is certainly nicer than proposed gfp flags. (__GFP_NOINIT_ON_ALLOC is just ugly, especially, to optimize such corner-case features) -- Thanks, David / dhildenb