On 11.03.22 10:01, Bibo Mao wrote:
> collapse huge page is slow, specially when khugepaged daemon runs
> on different numa node with that of huge page. It suffers from
> huge page copying across nodes, also cache is not used for target
> node. With this patch, khugepaged daemon switches to the same numa
> node with huge page. It saves copying time and makes use of local
> cache better.

Hi,

just the usual question: do you have any performance numbers to back
your claims (e.g., "is slow, specially when") and proof that this patch
does the trick? Even a simple microbenchmark would help; see the sketch
at the end of this mail for one way to measure the copy penalty.

> 
> Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
> ---
>  mm/khugepaged.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 131492fd1148..460c285dc974 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -116,6 +116,7 @@ struct khugepaged_scan {
>  	struct list_head mm_head;
>  	struct mm_slot *mm_slot;
>  	unsigned long address;
> +	int node;
>  };
>  
>  static struct khugepaged_scan khugepaged_scan = {
> @@ -1066,6 +1067,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	struct vm_area_struct *vma;
>  	struct mmu_notifier_range range;
>  	gfp_t gfp;
> +	const struct cpumask *cpumask;

We tend to stick to reverse Christmas tree format as well as possible
(example below, after the patch).

> 
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  
> @@ -1079,6 +1081,13 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	 * that. We will recheck the vma after taking it again in write mode.
>  	 */
>  	mmap_read_unlock(mm);
> +
> +	/* sched to specified node before huage page memory copy */

s/huage/huge/

> +	cpumask = cpumask_of_node(node);
> +	if ((khugepaged_scan.node != node) && !cpumask_empty(cpumask)) {
> +		set_cpus_allowed_ptr(current, cpumask);
> +		khugepaged_scan.node = node;
> +	}
>  	new_page = khugepaged_alloc_page(hpage, gfp, node);
>  	if (!new_page) {
>  		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
> @@ -2380,6 +2389,7 @@ int start_stop_khugepaged(void)
>  		kthread_stop(khugepaged_thread);
>  		khugepaged_thread = NULL;
>  	}
> +	khugepaged_scan.node = NUMA_NO_NODE;
>  	set_recommended_min_free_kbytes();
>  fail:
>  	mutex_unlock(&khugepaged_mutex);
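
On the reverse Christmas tree comment above: order local variable
declarations from the longest line down to the shortest, so with your
new variable the declaration block would look like:

	struct mmu_notifier_range range;
	const struct cpumask *cpumask;
	struct vm_area_struct *vma;
	gfp_t gfp;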
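
Regarding numbers: the following userspace sketch (untested, assumes
libnuma and at least two populated NUMA nodes; node IDs are hard-coded
purely for illustration) is one way to show the cross-node copy penalty
your changelog refers to. Before/after timings of the actual collapse
path would of course be even more convincing.

/* Local vs. cross-node copy microbenchmark -- untested sketch.
 * Build: gcc -O2 copybench.c -o copybench -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define SZ	(512UL << 20)	/* 512 MiB, large enough to defeat caches */

static double copy_secs(void *dst, const void *src)
{
	struct timespec a, b;

	clock_gettime(CLOCK_MONOTONIC, &a);
	memcpy(dst, src, SZ);
	clock_gettime(CLOCK_MONOTONIC, &b);
	return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
	void *src, *dst_local, *dst_remote;

	if (numa_available() < 0 || numa_max_node() < 1) {
		fprintf(stderr, "need a NUMA machine with >= 2 nodes\n");
		return 1;
	}

	/* Run on node 0; the copy source lives there as well. */
	numa_run_on_node(0);
	src = numa_alloc_onnode(SZ, 0);
	dst_local = numa_alloc_onnode(SZ, 0);
	dst_remote = numa_alloc_onnode(SZ, 1);
	if (!src || !dst_local || !dst_remote)
		return 1;

	/* Touch everything up front so page faults don't skew timing. */
	memset(src, 1, SZ);
	memset(dst_local, 0, SZ);
	memset(dst_remote, 0, SZ);

	printf("copy to local node:  %.3fs\n", copy_secs(dst_local, src));
	printf("copy to remote node: %.3fs\n", copy_secs(dst_remote, src));

	numa_free(src, SZ);
	numa_free(dst_local, SZ);
	numa_free(dst_remote, SZ);
	return 0;
}

-- 
Thanks,

David / dhildenb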