On Wed, Apr 27, 2022 at 05:51:53PM -0700, Zach O'Keefe wrote: > > > @@ -1110,13 +1115,14 @@ static void collapse_huge_page(struct mm_struct *mm, > > > */ > > > mmap_read_unlock(mm); > > > > > > + node = khugepaged_find_target_node(cc); > > > /* sched to specified node before huage page memory copy */ > > > if (task_node(current) != node) { > > > cpumask = cpumask_of_node(node); > > > if (!cpumask_empty(cpumask)) > > > set_cpus_allowed_ptr(current, cpumask); > > > } > > > - new_page = khugepaged_alloc_page(hpage, gfp, node); > > > + new_page = cc->alloc_hpage(cc, gfp, node); > > > > AFAICT you removed all references of khugepaged_alloc_page() in this patch, > > then you'd better drop the function for both NUMA and UMA. > > > > Sorry, I'm not sure I follow. In khugepaged context, logic WRT > khugepaged_alloc_page() is unchanged - it's just called indirectly > through ->alloc_hpage(). Ah you're right, sorry for the confusion. > > > Said that, I think it's actually better to keep them, as they do things > > useful. For example,AFAICT this ->alloc_hpage() change can leak the hpage > > alloacted for UMA already so that's why I think keeping them makes sense, > > then iiuc the BUG_ON would trigger with UMA already. > > > > I saw that you've moved khugepaged_find_target_node() into this function > > which looks definitely good, but if we could keep khugepaged_alloc_page() > > and then IMHO we could even move khugepaged_find_target_node() into that > > and drop the "node" parameter in khugepaged_alloc_page(). > > > > I actually had done this, until commit 4272063db38c ("mm/khugepaged: > sched to numa node when collapse huge page") which forced me to keep > "node" visible in this function. Right, I was looking at my local tree and that patch seems to be very lately added into -next. I'm curious why it wasn't applying to file thps too if it is worthwhile, since if so that's also a suitable candidate to be moved into the same hook. I've asked in that thread instead. Before that, feel free to keep your code as is. > > > I'd even consider moving cc->gfp() all into it if possible, since the gfp > > seems to be always with __GFP_THISNODE anyways. > > > > I would have liked to do this, but the gfp flags are referenced again > when calling mem_cgroup_charge(), so I couldn't quite get rid of them > from here. Yeah, maybe we could move mem_cgroup_charge() into the hook too? As below codes are duplicated between file & anon and IMHO they're good candidate to a new helper already anyway: /* Only allocate from the target node */ gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE; new_page = khugepaged_alloc_page(hpage, gfp, node); if (!new_page) { result = SCAN_ALLOC_HUGE_PAGE_FAIL; goto out; } if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) { result = SCAN_CGROUP_CHARGE_FAIL; goto out; } count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC); If we want to generalize it maybe we want to return the "result" instead of the new page though, so the newpage can be passed in as a pointer. There's one mmap_read_unlock(mm) for the anonymous code but I think that can simply be moved above the chunk. No strong opinion here, please feel free to think about what's the best approach for landing this series. [...] > > I could (?) make .gfp of type gfp_t and just update it on every > khugepaged scan (in case it changed) and also remove the gfp parameter > for ->alloc_hpage(). If that's the case I'd prefer you keep you code as-is; gfp() is perfectly fine and gfp() is light, I'm afraid that caching thing could make things complicated. Thanks, -- Peter Xu