Re: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages

Oscar Salvador <osalvador@xxxxxxx> · Thu, 18 Feb 2021 11:09:17 +0100

On Wed, Feb 17, 2021 at 04:00:11PM +0100, Michal Hocko wrote:
> Is this really necessary? dissolve_free_huge_page will take care of this
> and the race windown you are covering is really tiny.

Probably not, I was trying to shrink to race window as much as possible
but the call to dissolve_free_huge_page might be enough.

> > +	nid = page_to_nid(page);
> > +	spin_unlock(&hugetlb_lock);
> > +
> > +	/*
> > +	 * Before dissolving the page, we need to allocate a new one,
> > +	 * so the pool remains stable.
> > +	 */
> > +	new_page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
> 
> wrt. fallback to other zones, I haven't realized that the primary
> usecase is a form of memory offlining (from virt-mem). I am not yet sure
> what the proper behavior is in that case but if breaking hugetlb pools,
> similar to the normal hotplug operation, is viable then this needs a
> special mode. We do not want a random alloc_contig_range user to do the
> same. So for starter I would go with __GFP_THISNODE here.

Ok, makes sense.
__GFP_THISNODE will not allow fallback to other node's zones.
Since we only allow the nid the page belongs to, nodemask should be
NULL, right?

> > +	if (!h)
> > +		/*
> > +		 * The page might have been dissolved from under our feet.
> > +		 * If that is the case, return success as if we dissolved it
> > +		 * ourselves.
> > +		 */
> > +		return true;
> 
> nit I would put the comment above the conditin for both cases. It reads
> more easily that way. At least without { }.

Yes, makes sense.

> Other than that I haven't noticed any surprises.

I did. The 'put_page' call should be placed above, right after getting
the page. Otherwise, refcount == 1 and we will fail to dissolve the
new page if we need to (in case old page fails to be dissolved).
I already fixed that locally.

-- 
Oscar Salvador
SUSE L3