<snip>
>However since there are 800 threads, I see there might be an opportunity
>for another thread to enter in the next 'numa_scan_period' while
>we have not gotten till numab_state allocation.
>
>There should be simpler ways to overcome like Vlastimil already pointed
>in the other thread, and having lock is an overkill.
>
>for e.g.,
>numab_state = kzalloc(..)
>
>if we see that some other thread able to successfully assign
>vma->numab_state with their allocation (with cmpxchg), simply
>free your allocation.
>
>Can you please check if my understanding is correct?

Thanks for Vlastimil's and Raghu's reviews and comments.

Yes, your understanding is correct. Before submitting this patch, I had
two internal proposals: a lock-based version and a cmpxchg-based one.
Here is my cmpxchg version (tests passed). If you're OK with this
cmpxchg version, may I have your Reviewed-by? I'll send a v2 then.

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3356315d7e64..7f99df294583 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3399,10 +3399,16 @@ static void task_numa_work(struct callback_head *work)
 
 		/* Initialise new per-VMA NUMAB state. */
 		if (!vma->numab_state) {
-			vma->numab_state = kzalloc(sizeof(struct vma_numab_state),
-				GFP_KERNEL);
-			if (!vma->numab_state)
+			struct vma_numab_state *ptr;
+
+			ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
+			if (!ptr)
+				continue;
+
+			if (cmpxchg(&vma->numab_state, NULL, ptr)) {
+				kfree(ptr);
 				continue;
+			}
 
 			vma->numab_state->start_scan_seq = mm->numa_scan_seq;
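
For anyone reading along, here is a minimal userspace sketch of the same
allocate-then-publish idiom, using C11 atomics in place of the kernel's
cmpxchg(). The names (struct state, state_get, the slot parameter) are
made up for illustration and are not part of the patch:

#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical per-object state, standing in for struct vma_numab_state. */
struct state {
	int start_scan_seq;
};

/*
 * Lock-free one-time initialization: each racing thread allocates its
 * own candidate, but only the first compare-and-swap publishes it;
 * losers free their copy and use the winner's pointer instead.
 */
static struct state *state_get(_Atomic(struct state *) *slot)
{
	struct state *cur = atomic_load(slot);

	if (cur)
		return cur;

	struct state *ptr = calloc(1, sizeof(*ptr));
	if (!ptr)
		return NULL;

	struct state *expected = NULL;
	if (!atomic_compare_exchange_strong(slot, &expected, ptr)) {
		free(ptr);		/* another thread won the race */
		return expected;	/* CAS failure reports the winner */
	}
	return ptr;
}

The trade-off versus a lock is that a losing thread pays one wasted
allocation, which is cheap given how narrow the race window is; in
exchange, the common path stays a plain load with no lock taken on the
scan fast path.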