On Tue, Sep 07, 2021 at 04:13:07PM +0200, Marco Elver wrote:
> Shuah Khan reported:
>
> | When CONFIG_PROVE_RAW_LOCK_NESTING=y and CONFIG_KASAN are enabled,
> | kasan_record_aux_stack() runs into "BUG: Invalid wait context" when
> | it tries to allocate memory attempting to acquire spinlock in page
> | allocation code while holding workqueue pool raw_spinlock.
> |
> | There are several instances of this problem when block layer tries
> | to __queue_work(). Call trace from one of these instances is below:
> |
> |     kblockd_mod_delayed_work_on()
> |       mod_delayed_work_on()
> |         __queue_delayed_work()
> |           __queue_work() (rcu_read_lock, raw_spin_lock pool->lock held)
> |             insert_work()
> |               kasan_record_aux_stack()
> |                 kasan_save_stack()
> |                   stack_depot_save()
> |                     alloc_pages()
> |                       __alloc_pages()
> |                         get_page_from_freelist()
> |                           rm_queue()
> |                             rm_queue_pcplist()
> |                               local_lock_irqsave(&pagesets.lock, flags);
> |                               [ BUG: Invalid wait context triggered ]
>
> The default kasan_record_aux_stack() calls stack_depot_save() with
> GFP_NOWAIT, which in turn can then call alloc_pages(GFP_NOWAIT, ...).
> In general, however, it is not even possible to use either GFP_ATOMIC
> nor GFP_NOWAIT in certain non-preemptive contexts, including
> raw_spin_locks (see gfp.h and ab00db216c9c7).
>
> Fix it by instructing stackdepot to not expand stack storage via
> alloc_pages() in case it runs out by using kasan_record_aux_stack_noalloc().
>
> While there is an increased risk of failing to insert the stack trace,
> this is typically unlikely, especially if the same insertion had already
> succeeded previously (stack depot hit). For frequent calls from the same
> location, it therefore becomes extremely unlikely that
> kasan_record_aux_stack_noalloc() fails.
>
> Link: https://lkml.kernel.org/r/20210902200134.25603-1-skhan@xxxxxxxxxxxxxxxxxxx
> Reported-by: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Marco Elver <elver@xxxxxxxxxx>

Acked-by: Tejun Heo <tj@xxxxxxxxxx>

Thanks.

-- 
tejun