Re: [PATCH v3 03/12] mm/khugepaged: make hugepage allocation context-specific

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 27, 2022 at 05:51:53PM -0700, Zach O'Keefe wrote:
> > > @@ -1110,13 +1115,14 @@ static void collapse_huge_page(struct mm_struct *mm,
> > >        */
> > >       mmap_read_unlock(mm);
> > >
> > > +     node = khugepaged_find_target_node(cc);
> > >       /* sched to specified node before huage page memory copy */
> > >       if (task_node(current) != node) {
> > >               cpumask = cpumask_of_node(node);
> > >               if (!cpumask_empty(cpumask))
> > >                       set_cpus_allowed_ptr(current, cpumask);
> > >       }
> > > -     new_page = khugepaged_alloc_page(hpage, gfp, node);
> > > +     new_page = cc->alloc_hpage(cc, gfp, node);
> >
> > AFAICT you removed all references of khugepaged_alloc_page() in this patch,
> > then you'd better drop the function for both NUMA and UMA.
> >
> 
> Sorry, I'm not sure I follow. In khugepaged context, logic WRT
> khugepaged_alloc_page() is unchanged - it's just called indirectly
> through ->alloc_hpage().

Ah you're right, sorry for the confusion.

> 
> > Said that, I think it's actually better to keep them, as they do things
> > useful.  For example,AFAICT this ->alloc_hpage() change can leak the hpage
> > alloacted for UMA already so that's why I think keeping them makes sense,
> > then iiuc the BUG_ON would trigger with UMA already.
> >
> > I saw that you've moved khugepaged_find_target_node() into this function
> > which looks definitely good, but if we could keep khugepaged_alloc_page()
> > and then IMHO we could even move khugepaged_find_target_node() into that
> > and drop the "node" parameter in khugepaged_alloc_page().
> >
> 
> I actually had done this, until commit 4272063db38c ("mm/khugepaged:
> sched to numa node when collapse huge page") which forced me to keep
> "node" visible in this function.

Right, I was looking at my local tree and that patch seems to be very
lately added into -next.

I'm curious why it wasn't applying to file thps too if it is worthwhile,
since if so that's also a suitable candidate to be moved into the same
hook.  I've asked in that thread instead.

Before that, feel free to keep your code as is.

> 
> > I'd even consider moving cc->gfp() all into it if possible, since the gfp
> > seems to be always with __GFP_THISNODE anyways.
> >
> 
> I would have liked to do this, but the gfp flags are referenced again
> when calling mem_cgroup_charge(), so I couldn't quite get rid of them
> from here.

Yeah, maybe we could move mem_cgroup_charge() into the hook too?  As below
codes are duplicated between file & anon and IMHO they're good candidate to
a new helper already anyway:

	/* Only allocate from the target node */
	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;

	new_page = khugepaged_alloc_page(hpage, gfp, node);
	if (!new_page) {
		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
		goto out;
	}

	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
		result = SCAN_CGROUP_CHARGE_FAIL;
		goto out;
	}
	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);

If we want to generalize it maybe we want to return the "result" instead of
the new page though, so the newpage can be passed in as a pointer.

There's one mmap_read_unlock(mm) for the anonymous code but I think that
can simply be moved above the chunk.

No strong opinion here, please feel free to think about what's the best
approach for landing this series.

[...]

> 
> I could (?) make .gfp of type gfp_t and just update it on every
> khugepaged scan (in case it changed) and also remove the gfp parameter
> for ->alloc_hpage().

If that's the case I'd prefer you keep you code as-is; gfp() is perfectly
fine and gfp() is light, I'm afraid that caching thing could make things
complicated.

Thanks,

-- 
Peter Xu





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux