Re: [PATCH v9 4/4] mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl

On Thu, May 05, 2022 at 09:48:34AM -0700, Mike Kravetz wrote:
> On 5/5/22 01:02, Muchun Song wrote:
> > On Wed, May 04, 2022 at 08:36:00PM -0700, Mike Kravetz wrote:
> >> On 5/4/22 19:35, Muchun Song wrote:
> >>> On Wed, May 04, 2022 at 03:12:39PM -0700, Mike Kravetz wrote:
> >>>> On 4/29/22 05:18, Muchun Song wrote:
> >>>>> +static void vmemmap_optimize_mode_switch(enum vmemmap_optimize_mode to)
> >>>>> +{
> >>>>> +	if (vmemmap_optimize_mode == to)
> >>>>> +		return;
> >>>>> +
> >>>>> +	if (to == VMEMMAP_OPTIMIZE_OFF)
> >>>>> +		static_branch_dec(&hugetlb_optimize_vmemmap_key);
> >>>>> +	else
> >>>>> +		static_branch_inc(&hugetlb_optimize_vmemmap_key);
> >>>>> +	vmemmap_optimize_mode = to;
> >>>>> +}
> >>>>> +
> >>>>>  static int __init hugetlb_vmemmap_early_param(char *buf)
> >>>>>  {
> >>>>>  	bool enable;
> >>>>> +	enum vmemmap_optimize_mode mode;
> >>>>>  
> >>>>>  	if (kstrtobool(buf, &enable))
> >>>>>  		return -EINVAL;
> >>>>>  
> >>>>> -	if (enable)
> >>>>> -		static_branch_enable(&hugetlb_optimize_vmemmap_key);
> >>>>> -	else
> >>>>> -		static_branch_disable(&hugetlb_optimize_vmemmap_key);
> >>>>> +	mode = enable ? VMEMMAP_OPTIMIZE_ON : VMEMMAP_OPTIMIZE_OFF;
> >>>>> +	vmemmap_optimize_mode_switch(mode);
> >>>>>  
> >>>>>  	return 0;
> >>>>>  }
> >>>>> @@ -60,6 +80,8 @@ int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head)
> >>>>>  	vmemmap_end	= vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
> >>>>>  	vmemmap_reuse	= vmemmap_addr - PAGE_SIZE;
> >>>>>  
> >>>>> +	VM_BUG_ON_PAGE(!vmemmap_pages, head);
> >>>>> +
> >>>>>  	/*
> >>>>>  	 * The pages which the vmemmap virtual address range [@vmemmap_addr,
> >>>>>  	 * @vmemmap_end) are mapped to are freed to the buddy allocator, and
> >>>>> @@ -69,8 +91,10 @@ int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head)
> >>>>>  	 */
> >>>>>  	ret = vmemmap_remap_alloc(vmemmap_addr, vmemmap_end, vmemmap_reuse,
> >>>>>  				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> >>>>> -	if (!ret)
> >>>>> +	if (!ret) {
> >>>>>  		ClearHPageVmemmapOptimized(head);
> >>>>> +		static_branch_dec(&hugetlb_optimize_vmemmap_key);
> >>>>> +	}
> >>>>>  
> >>>>>  	return ret;
> >>>>>  }
> >>>>> @@ -84,6 +108,8 @@ void hugetlb_vmemmap_free(struct hstate *h, struct page *head)
> >>>>>  	if (!vmemmap_pages)
> >>>>>  		return;
> >>>>>  
> >>>>> +	static_branch_inc(&hugetlb_optimize_vmemmap_key);
> >>>>
> >>>> Can you explain the reasoning behind doing the static_branch_inc here in free,
> >>>> and static_branch_dec in alloc?
> >>>> IIUC, they may not be absolutely necessary but you could use the count to
> >>>> know how many optimized pages are in use?  Or, I may just be missing
> >>>> something.
> >>>>
> >>>
> >>> Partly right. A 'count' alone is not enough. I implemented this with a similar
> >>> approach in v6 [1]. Besides the 'count', we also need a lock for synchronization.
> >>> However, both the count and the synchronization are already provided by the
> >>> static_key_inc/dec infrastructure. It is simpler to use static_key_inc/dec directly, right?
> >>>
> >>> [1] https://lore.kernel.org/all/20220330153745.20465-5-songmuchun@xxxxxxxxxxxxx/
> >>>
> >>
> >> Sorry, but I am a little confused.
> >>
> >> vmemmap_optimize_mode_switch will static_key_inc to enable and static_key_dec
> >> to disable.  In addition each time we optimize (allocate) a hugetlb page after
> >> enabling we will static_key_inc.
> >>
> >> Suppose we have 1 hugetlb page optimized.  So static count == 2 IIUC.
> >> Then someone turns off optimization via sysctl.  static count == 1 ???
> > 
> > Definitely right.
> > 
> >> If we then add another hugetlb page via nr_hugepages it seems that it
> >> would be optimized as static count == 1.  Is that correct?  Do we need
> > 
> > I'm wrong.
> > 
> >> to free all hugetlb pages with optimization before we can add new pages
> >> without optimization?
> >>
> > 
> > My bad. I think the following code would fix this.
> > 
> > Thanks for your careful review.
> > 
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index 5820a681a724..997e192aeed7 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -105,7 +105,7 @@ void hugetlb_vmemmap_free(struct hstate *h, struct page *head)
> >         unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;
> > 
> >         vmemmap_pages = hugetlb_optimize_vmemmap_pages(h);
> > -       if (!vmemmap_pages)
> > +       if (!vmemmap_pages || READ_ONCE(vmemmap_optimize_mode) == VMEMMAP_OPTIMIZE_OFF)
> >                 return;
> > 
> >         static_branch_inc(&hugetlb_optimize_vmemmap_key);
> >  
> 
> If vmemmap_optimize_mode == VMEMMAP_OPTIMIZE_OFF is sufficient for turning
> off optimizations, do we really need to static_branch_inc/dec for each
> hugetlb page?
>

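static_branch_inc/dec is still necessary, because the user could switch
vmemmap_optimize_mode to off right after the 'if' check. In that case the
per-page static_branch_inc() re-enables the key for the page being optimized.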

CPU0:				CPU1:
// Assume vmemmap_optimize_mode == 1
// and static_key_count == 1
if (vmemmap_optimize_mode == VMEMMAP_OPTIMIZE_OFF)
	return;
				hugetlb_optimize_vmemmap_handler();
					vmemmap_optimize_mode = 0;
					static_branch_dec();
					// static_key_count == 0
// Enable static_key if necessary
static_branch_inc();
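
To make the interleaving above concrete, here is a minimal single-threaded
userspace sketch that replays the same steps in order. It is only a model:
all of the model_* names are hypothetical and are not kernel APIs, and the
plain int counter stands in for the static key count, assuming the key is
enabled whenever the count is greater than zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the static key: "enabled" while count > 0. */
struct model_key {
	int count;
};

enum model_mode { MODEL_OFF = 0, MODEL_ON = 1 };

struct model_state {
	struct model_key key;
	enum model_mode mode;
};

static void model_key_inc(struct model_key *k)
{
	k->count++;
}

static void model_key_dec(struct model_key *k)
{
	assert(k->count > 0);
	k->count--;
}

static bool model_key_enabled(const struct model_key *k)
{
	return k->count > 0;
}

/* Same shape as vmemmap_optimize_mode_switch(): the mode holds one reference. */
static void model_mode_switch(struct model_state *s, enum model_mode to)
{
	if (s->mode == to)
		return;

	if (to == MODEL_OFF)
		model_key_dec(&s->key);
	else
		model_key_inc(&s->key);
	s->mode = to;
}

/* Replays the CPU0/CPU1 steps sequentially and returns the final count. */
static int model_replay_race(void)
{
	struct model_state s = { .key = { .count = 0 }, .mode = MODEL_OFF };

	model_mode_switch(&s, MODEL_ON);	/* mode == 1, count == 1 */

	/* CPU0: the mode check passes, so CPU0 goes on to optimize the page. */
	if (s.mode == MODEL_OFF)
		return -1;

	/* CPU1: the sysctl handler turns the mode off in between. */
	model_mode_switch(&s, MODEL_OFF);	/* count == 0 */
	assert(!model_key_enabled(&s.key));

	/* CPU0: the per-page reference brings the key back up. */
	model_key_inc(&s.key);			/* count == 1 */
	assert(model_key_enabled(&s.key));

	return s.key.count;
}
```

In this model model_replay_race() ends with the count held only by the one
optimized page, even though the mode is already off. The sketch does not
exercise real concurrency; it only shows why the mode check alone cannot
replace the per-page reference.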

Does this make sense to you?

Thanks.


