On Tue, Mar 2, 2021 at 4:44 PM Michal Hocko <mhocko@xxxxxxxx> wrote: > > On Tue 02-03-21 15:37:33, Muchun Song wrote: > > The alloc_thread_stack_node() cannot guarantee that allocated stack pages > > are in the same node when CONFIG_VMAP_STACK. Because we do not specify > > __GFP_THISNODE to __vmalloc_node_range(). Fix it by caling > > mod_lruvec_page_state() for each page one by one. > > What is the actual problem you are trying to address by this patch? > 991e7673859e has deliberately dropped the per page accounting. Can you > explain why that was incorrect? There surely is some imprecision > involved but does it matter and is it even observable? When I read the code of account_kernel_stack(), I see a comment that says "All stack pages are in the same node". I am confused about this. IIUC, there is no guarantee about this. Right? Yeah, imprecision may not be a problem. But if this is what we did deliberately, I think that it is better to add a comment there. Thanks. > > > Fixes: 991e7673859e ("mm: memcontrol: account kernel stack per node") > > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx> > > --- > > kernel/fork.c | 15 ++++++++++----- > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > index d66cd1014211..6e2201feb524 100644 > > --- a/kernel/fork.c > > +++ b/kernel/fork.c > > @@ -379,14 +379,19 @@ static void account_kernel_stack(struct task_struct *tsk, int account) > > void *stack = task_stack_page(tsk); > > struct vm_struct *vm = task_stack_vm_area(tsk); > > > > + if (vm) { > > + int i; > > > > - /* All stack pages are in the same node. */ > > - if (vm) > > - mod_lruvec_page_state(vm->pages[0], NR_KERNEL_STACK_KB, > > - account * (THREAD_SIZE / 1024)); > > - else > > + BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE); > > + > > + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) > > + mod_lruvec_page_state(vm->pages[i], NR_KERNEL_STACK_KB, > > + account * (PAGE_SIZE / 1024)); > > + } else { > > + /* All stack pages are in the same node. */ > > mod_lruvec_kmem_state(stack, NR_KERNEL_STACK_KB, > > account * (THREAD_SIZE / 1024)); > > + } > > } > > > > static int memcg_charge_kernel_stack(struct task_struct *tsk) > > -- > > 2.11.0 > > -- > Michal Hocko > SUSE Labs