On Wed, Apr 12, 2023 at 1:54 PM David Rientjes <rientjes@xxxxxxxxxx> wrote:
>
> On Wed, 12 Apr 2023, Pasha Tatashin wrote:
>
> > HugeTLB pages have a struct page optimizations where struct pages for tail
> > pages are freed. However, when HugeTLB pages are destroyed, the memory for
> > struct pages (vmemmap) need to be allocated again.
> >
> > Currently, __GFP_NORETRY flag is used to allocate the memory for vmemmap,
> > but given that this flag makes very little effort to actually reclaim
> > memory the returning of huge pages back to the system can be problem. Lets
> > use __GFP_RETRY_MAYFAIL instead. This flag is also performs graceful
> > reclaim without causing ooms, but at least it may perform a few retries,
> > and will fail only when there is genuinely little amount of unused memory
> > in the system.
> >
>
> Thanks Pasha, this definitely makes sense. We want to free the hugetlb
> page back to the system so it would be a shame to have to strand it in the
> hugetlb pool because we can't allocate the tail pages (we want to free
> more memory than we're allocating).
>
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> > Suggested-by: David Rientjes <rientjes@xxxxxxxxxx>
> > ---
> >  mm/hugetlb_vmemmap.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index a559037cce00..c4226d2af7cc 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
> >        * the range is mapped to the page which @vmemmap_reuse is mapped to.
> >        * When a HugeTLB page is freed to the buddy allocator, previously
> >        * discarded vmemmap pages must be allocated and remapping.
> > +      *
> > +      * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> > +      * unused memory in the system.
> >        */
> >       ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> > -                               GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> > +                               GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
> >       if (!ret) {
> >               ClearHPageVmemmapOptimized(head);
> >               static_branch_dec(&hugetlb_optimize_vmemmap_key);
>
> The behavior of __GFP_RETRY_MAYFAIL is different for high-order memory (at
> least larger than PAGE_ALLOC_COSTLY_ORDER). The order that we're
> allocating would depend on the implementation of alloc_vmemmap_page_list()
> so likely best to move the gfp mask to that function.

Thank you David. This makes sense, I will send the 2nd version soon.

Pasha