Re: [PATCH 0/6] stackdepot, kasan, workqueue: Avoid expanding stackdepot slabs when holding raw_spin_lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 10 Sept 2021 at 17:28, Sebastian Andrzej Siewior
<bigeasy@xxxxxxxxxxxxx> wrote:
> On 2021-09-10 12:50:51 [+0200], Vlastimil Babka wrote:
> > > Thank you. Tested all the 6 patches in this series on Linux 5.14. This problem
> > > exists in 5.13 and needs to be marked for both 5.14 and 5.13 stable releases.
> >
> > I think if this problem manifests only with CONFIG_PROVE_RAW_LOCK_NESTING
> > then it shouldn't be backported to stable. CONFIG_PROVE_RAW_LOCK_NESTING is
> > an experimental/development option to earlier discover what will collide
> > with RT lock semantics, without needing the full RT tree.
> > Thus, good to fix going forward, but not necessary to stable backport.
>
>   Acked-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> for the series. Thank you.

Thank you. I'll send v2 with Acks/Tested-by added and the comment
addition you suggested.

> As for the backport I agree here with Vlastimil.
>
> I pulled it into my RT tree for some testing and it looked good. I had
> to
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3030,7 +3030,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
>         head->func = func;
>         head->next = NULL;
>         local_irq_save(flags);
> -       kasan_record_aux_stack(head);
> +       kasan_record_aux_stack_noalloc(head);
>         rdp = this_cpu_ptr(&rcu_data);
>
>         /* Add the callback to our list. */
>
> We could move kasan_record_aux_stack() before that local_irq_save() but
> then call_rcu() can be called preempt-disabled section so we would have
> the same problem.
>
> The second warning came from kasan_quarantine_remove_cache(). At the end
> per_cpu_remove_cache() -> qlist_free_all() will free memory with
> disabled interrupts (due to that smp-function call).
> Moving it to kworker would solve the problem. I don't mind keeping that
> smp_function call assuming that it is all debug-code and it increases
> overall latency anyway. But then could we maybe move all those objects
> to a single list which freed after on_each_cpu()?

The quarantine is per-CPU, and I think what you suggest would
fundamentally change its design. If you have something that works on
RT without a fundamental change would be ideal (it is all debug code
and not used on non-KASAN kernels).


> Otherwise I haven't seen any new warnings showing up with KASAN enabled.
>
> Sebastian




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux