On Fri, 10 Sept 2021 at 17:28, Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:
> On 2021-09-10 12:50:51 [+0200], Vlastimil Babka wrote:
> > > Thank you. Tested all the 6 patches in this series on Linux 5.14. This problem
> > > exists in 5.13 and needs to be marked for both 5.14 and 5.13 stable releases.
> >
> > I think if this problem manifests only with CONFIG_PROVE_RAW_LOCK_NESTING
> > then it shouldn't be backported to stable. CONFIG_PROVE_RAW_LOCK_NESTING is
> > an experimental/development option to earlier discover what will collide
> > with RT lock semantics, without needing the full RT tree.
> > Thus, good to fix going forward, but not necessary to stable backport.
>
> Acked-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> for the series. Thank you.

Thank you. I'll send v2 with Acks/Tested-by added and the comment
addition you suggested.

> As for the backport I agree here with Vlastimil.
>
> I pulled it into my RT tree for some testing and it looked good. I had
> to
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3030,7 +3030,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
>         head->func = func;
>         head->next = NULL;
>         local_irq_save(flags);
> -       kasan_record_aux_stack(head);
> +       kasan_record_aux_stack_noalloc(head);
>         rdp = this_cpu_ptr(&rcu_data);
>
>         /* Add the callback to our list. */
>
> We could move kasan_record_aux_stack() before that local_irq_save() but
> then call_rcu() can be called preempt-disabled section so we would have
> the same problem.
>
> The second warning came from kasan_quarantine_remove_cache(). At the end
> per_cpu_remove_cache() -> qlist_free_all() will free memory with
> disabled interrupts (due to that smp-function call).
> Moving it to kworker would solve the problem. I don't mind keeping that
> smp_function call assuming that it is all debug-code and it increases
> overall latency anyway. But then could we maybe move all those objects
> to a single list which freed after on_each_cpu()?

The quarantine is per-CPU, and I think what you suggest would
fundamentally change its design. If you have something that works on
RT without a fundamental change, that would be ideal (it is all debug
code and not used on non-KASAN kernels).

> Otherwise I haven't seen any new warnings showing up with KASAN enabled.
>
> Sebastian
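
For illustration only, the "single list which freed after on_each_cpu()"
idea quoted above could be modelled in plain userspace C roughly as
below. This is a sketch under stated assumptions, not kernel code and
not a proposed patch: every name in it (NR_MODEL_CPUS, struct qnode,
model_detach_cpu(), model_free_all(), model_run()) is hypothetical, the
"CPUs" are just array slots visited in a loop, and a real version would
also need locking around the shared list and the per-cache filtering
that the quarantine does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_MODEL_CPUS 4	/* hypothetical; stands in for the online CPUs */

struct qnode {
	struct qnode *next;
};

struct qlist {
	struct qnode *head;
	struct qnode *tail;
};

/* One quarantine list per modelled CPU, plus one shared "to free" list. */
static struct qlist percpu_quarantine[NR_MODEL_CPUS];
static struct qlist to_free;

static void qlist_put(struct qlist *q, struct qnode *n)
{
	n->next = NULL;
	if (!q->head)
		q->head = n;
	else
		q->tail->next = n;
	q->tail = n;
}

/*
 * Models the part that would run on each CPU with interrupts disabled:
 * only pointer manipulation, nothing is freed here.
 */
static void model_detach_cpu(int cpu)
{
	struct qlist *from = &percpu_quarantine[cpu];

	if (!from->head)
		return;
	if (!to_free.head)
		to_free.head = from->head;
	else
		to_free.tail->next = from->head;
	to_free.tail = from->tail;
	from->head = from->tail = NULL;
}

/*
 * Models the part that would run after the cross-CPU call has returned:
 * walk the single list and free everything. Returns the number freed.
 */
static size_t model_free_all(void)
{
	struct qnode *cur = to_free.head;
	size_t freed = 0;

	while (cur) {
		struct qnode *next = cur->next;

		free(cur);
		cur = next;
		freed++;
	}
	to_free.head = to_free.tail = NULL;
	return freed;
}

/* Fill every modelled CPU with per_cpu objects, detach, then free. */
size_t model_run(int per_cpu)
{
	int cpu, i;

	assert(to_free.head == NULL);
	for (cpu = 0; cpu < NR_MODEL_CPUS; cpu++) {
		for (i = 0; i < per_cpu; i++) {
			struct qnode *n = calloc(1, sizeof(*n));

			if (!n)
				abort();
			qlist_put(&percpu_quarantine[cpu], n);
		}
	}
	for (cpu = 0; cpu < NR_MODEL_CPUS; cpu++)
		model_detach_cpu(cpu);	/* "irq-off" phase */
	return model_free_all();	/* deferred free phase */
}
```

The point of the split is only that the freeing loop ends up outside
the interrupts-disabled phase; whether that counts as a fundamental
change to the per-CPU design is exactly the open question in this
thread.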