On Tue, Mar 3, 2020 at 10:15 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
>
> On Tue, Mar 3, 2020 at 9:47 AM Yang Shi <shy828301@xxxxxxxxx> wrote:
> >
> > On Tue, Mar 3, 2020 at 2:53 AM Tetsuo Handa
> > <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Hello, Naresh.
> > >
> > > > [ 98.003346] WARNING: CPU: 2 PID: 340 at
> > > > include/linux/sched/mm.h:323 alloc_page_buffers+0x210/0x288
> > >
> > > This is
> > >
> > > /**
> > >  * memalloc_use_memcg - Starts the remote memcg charging scope.
> > >  * @memcg: memcg to charge.
> > >  *
> > >  * This function marks the beginning of the remote memcg charging scope. All the
> > >  * __GFP_ACCOUNT allocations till the end of the scope will be charged to the
> > >  * given memcg.
> > >  *
> > >  * NOTE: This function is not nesting safe.
> > >  */
> > > static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
> > > {
> > >         WARN_ON_ONCE(current->active_memcg);
> > >         current->active_memcg = memcg;
> > > }
> > >
> > > which is about memcg. Redirecting to linux-mm.
> >
> > Isn't this triggered by ("loop: use worker per cgroup instead of
> > kworker") in linux-next, which converted loop driver to use worker per
> > cgroup, so it may have multiple workers work at the mean time?
> >
> > So they may share the same "current", then it may cause kind of nested
> > call to memalloc_use_memcg().
> >
> > Could you please try the below debug patch? This is not the proper
> > fix, but it may help us narrow down the problem.
> >
> > diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
> > index c49257a..1cc1cdc 100644
> > --- a/include/linux/sched/mm.h
> > +++ b/include/linux/sched/mm.h
> > @@ -320,6 +320,10 @@ static inline void
> > memalloc_nocma_restore(unsigned int flags)
> >   */
> >  static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
> >  {
> > +       if ((current->flags & PF_KTHREAD) &&
> > +            current->active_memcg)
> > +               return;
> > +
> >         WARN_ON_ONCE(current->active_memcg);
> >         current->active_memcg = memcg;
> >  }
> >
>
> Maybe it's time to make memalloc_use_memcg() nesting safe.

We need to handle the case below:

CPU A                           CPU B
memalloc_use_memcg
                                memalloc_use_memcg
memalloc_unuse_memcg
                                memalloc_unuse_memcg

Both may manipulate the same task->active_memcg, so CPU B may still see
the wrong memcg, and the last call to memalloc_unuse_memcg() on CPU B may
not restore active_memcg to NULL. And some code depends on active_memcg
being correct.
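
To make the concern concrete, here is a rough userspace sketch, not
kernel code. It assumes a save/restore style of nesting safety (install
the new memcg, hand back the old one, put it back at the end of the
scope). The names use_memcg_save(), use_memcg_restore(), struct task,
run_nested() and run_interleaved() are made up for illustration, and
struct mem_cgroup is only a stand-in for the real type.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct mem_cgroup { const char *name; };
struct task { struct mem_cgroup *active_memcg; };

/* Hypothetical nesting-safe "use": install memcg, return the previous one. */
static struct mem_cgroup *use_memcg_save(struct task *t,
					 struct mem_cgroup *memcg)
{
	struct mem_cgroup *old = t->active_memcg;

	t->active_memcg = memcg;
	return old;
}

/* Hypothetical "unuse": put back whatever the matching save returned. */
static void use_memcg_restore(struct task *t, struct mem_cgroup *old)
{
	t->active_memcg = old;
}

/* Strictly nested (LIFO) scopes. Returns the final active_memcg. */
static struct mem_cgroup *run_nested(struct task *t, struct mem_cgroup *a,
				     struct mem_cgroup *b)
{
	struct mem_cgroup *old_a = use_memcg_save(t, a);
	struct mem_cgroup *old_b = use_memcg_save(t, b);

	assert(t->active_memcg == b);	/* inner scope charges to b */
	use_memcg_restore(t, old_b);
	assert(t->active_memcg == a);	/* outer scope is back in effect */
	use_memcg_restore(t, old_a);
	return t->active_memcg;
}

/*
 * The interleaving from the diagram, replayed sequentially:
 * A use, B use, A unuse, B unuse. *seen_by_b records what B would
 * charge to after A has left its scope.
 */
static struct mem_cgroup *run_interleaved(struct task *t,
					  struct mem_cgroup *a,
					  struct mem_cgroup *b,
					  struct mem_cgroup **seen_by_b)
{
	struct mem_cgroup *old_a = use_memcg_save(t, a);	/* CPU A */
	struct mem_cgroup *old_b = use_memcg_save(t, b);	/* CPU B */

	use_memcg_restore(t, old_a);				/* CPU A */
	*seen_by_b = t->active_memcg;
	use_memcg_restore(t, old_b);				/* CPU B */
	return t->active_memcg;
}
```

Starting from active_memcg == NULL, run_nested() ends with NULL again,
so plain nesting is fine with this scheme. run_interleaved() is the
problem case: B sees NULL instead of its own memcg once A has restored,
and B's final restore leaves A's memcg installed rather than NULL. So
save/restore alone would not cover the interleaving above, if it can
really happen on one task.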