On Wed 03-03-21 17:39:56, Muchun Song wrote: > For simplification 991e7673859e ("mm: memcontrol: account kernel stack > per node") has changed the per zone vmalloc backed stack pages > accounting to per node. By doing that we have lost a certain precision > because those pages might live in different NUMA nodes. In the end > NR_KERNEL_STACK_KB exported to the userspace might be over estimated on > some nodes while underestimated on others. > > This doesn't impose any real problem to correctnes of the kernel > behavior as the counter is not used for any internal processing but it > can cause some confusion to the userspace. You have skipped over one part of the changelog I have proposed and that is to provide an actual data. > Address the problem by accounting each vmalloc backing page to its own > node. > > Fixes: 991e7673859e ("mm: memcontrol: account kernel stack per node") Fixes tag might make somebody assume this is worth backporting but I highly doubt so. > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx> > Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx> Anyway Acked-by: Michal Hocko <mhocko@xxxxxxxx> as the patch is correct with one comment below > --- > Changelog in v2: > - Rework commit log suggested by Michal. > > Thanks to Michal and Shakeel for review. > > kernel/fork.c | 15 ++++++++++----- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/kernel/fork.c b/kernel/fork.c > index d66cd1014211..6e2201feb524 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -379,14 +379,19 @@ static void account_kernel_stack(struct task_struct *tsk, int account) > void *stack = task_stack_page(tsk); > struct vm_struct *vm = task_stack_vm_area(tsk); > > + if (vm) { > + int i; > > - /* All stack pages are in the same node. */ > - if (vm) > - mod_lruvec_page_state(vm->pages[0], NR_KERNEL_STACK_KB, > - account * (THREAD_SIZE / 1024)); > - else > + BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE); I do not think we need this BUG_ON. What kind of purpose does it serve? > + > + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) > + mod_lruvec_page_state(vm->pages[i], NR_KERNEL_STACK_KB, > + account * (PAGE_SIZE / 1024)); > + } else { > + /* All stack pages are in the same node. */ > mod_lruvec_kmem_state(stack, NR_KERNEL_STACK_KB, > account * (THREAD_SIZE / 1024)); > + } > } > > static int memcg_charge_kernel_stack(struct task_struct *tsk) > -- > 2.11.0 -- Michal Hocko SUSE Labs