On Tue, Sep 20, 2022 at 08:39:08AM +0800, eadavis@xxxxxxxx wrote: > From: Edward Adam Davis <eadavis@xxxxxxxx> > > On Mon, 19 Sep 2022 11:12:48 -0400, Daniel Jordan wrote: > > Hi Edward, > > > > On Mon, Sep 19, 2022 at 09:05:55AM +0800, eadavis@xxxxxxxx wrote: > > > From: Edward Adam Davis <eadavis@xxxxxxxx> > > > > > > Parallelized object serialization uses spin_unlock for unlocking a spin lock > > > that was previously locked with spin_lock. > > > > There's nothing unusual about that, though? > > > > > This caused the following lockdep warning about an inconsistent lock > > > state: > > > > > > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. > > > > Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did > > you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W? > Yes, I want say: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > > > > > We must use spin_lock_irqsave, because it is possible to trigger tipc > > > from an irq handler. > > > > A softirq handler, not a hardirq handler. I'd suggest using > > spin_lock_bh() instead of _irqsave in your patch. > I think _irqsave better than _bh, it can save the irq context, but _bh not, > and in tipc call trace contain SOFTIRQ-ON-W and IN-SOFTIRQ-W. _irqsave saving the context is about handling nested hardirq disables. It's not needed here since we don't need to care about disabling hardirq. _bh is for disabling softirq, a different context from hardirq. We want _bh here since the deadlock happens when a CPU takes the lock in both task and softirq context. padata uses _bh lock variants because it can be called in softirq context but not hardirq. Let's be consistent and do it in this case too. > > A Fixes tag would be helpful for stable and folks backporting this fix > > to understand what kernel versions are affected. > Yes, I will add it, thanks for your suggestion, and add Cc for the "Fixes" > owner. > > > > > WARNING: inconsistent lock state > > > 6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0 Not tainted > > > -------------------------------- > > > inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > > > syz-executor.2/27685 [HC0[0]:SC1[1]:HE1:SE0] takes: > > > ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:349 [inline] > > > ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: padata_do_serial+0x21e/0x4b0 kernel/padata.c:392 > > > {SOFTIRQ-ON-W} state was registered at: > > > lock_acquire kernel/locking/lockdep.c:5666 [inline] > > > lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631 > > > __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] > > > _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154 > > > spin_lock include/linux/spinlock.h:349 [inline] > > > padata_do_serial+0x21e/0x4b0 kernel/padata.c:392 > > > pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89 > > > padata_do_parallel+0x87b/0xa10 kernel/padata.c:217 > > > pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117 > > > crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94 > > > tls_do_encryption net/tls/tls_sw.c:529 [inline] > > > tls_push_record+0x13e8/0x3260 net/tls/tls_sw.c:762 > > > bpf_exec_tx_verdict+0xd82/0x11a0 net/tls/tls_sw.c:802 > > > tls_sw_sendmsg+0xa62/0x1820 net/tls/tls_sw.c:1014 > > > inet6_sendmsg+0x99/0xe0 net/ipv6/af_inet6.c:653 > > > sock_sendmsg_nosec net/socket.c:714 [inline] > > > sock_sendmsg+0xcf/0x120 net/socket.c:734 > > > sock_write_iter+0x291/0x3d0 net/socket.c:1108 > > > call_write_iter include/linux/fs.h:2187 [inline] > > > new_sync_write fs/read_write.c:491 [inline] > > > vfs_write+0x9e9/0xdd0 fs/read_write.c:578 > > > ksys_write+0x1e8/0x250 fs/read_write.c:631 > > > do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] > > > __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178 > > > do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203 > > > entry_SYSENTER_compat_after_hwframe+0x70/0x82 > > > irq event stamp: 740 > > > hardirqs last enabled at (740): [<ffffffff814919a0>] __local_bh_enable_ip+0xa0/0x120 kernel/softirq.c:401 > > > hardirqs last disabled at (739): [<ffffffff814919c3>] __local_bh_enable_ip+0xc3/0x120 kernel/softirq.c:378 > > > softirqs last enabled at (0): [<ffffffff8146f02e>] copy_process+0x213e/0x7090 kernel/fork.c:2202 > > > softirqs last disabled at (717): [<ffffffff81491843>] invoke_softirq kernel/softirq.c:445 [inline] > > > softirqs last disabled at (717): [<ffffffff81491843>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650 > > > > > > other info that might help us debug this: > > > Possible unsafe locking scenario: > > > > > > CPU0 > > > ---- > > > lock(&pd_list->lock); > > > <Interrupt> > > > lock(&pd_list->lock); > > > > > > *** DEADLOCK *** > > > > > > 4 locks held by syz-executor.2/27685: > > > #0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: open_last_lookups fs/namei.c:3470 [inline] > > > #0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: path_openat+0x2613/0x28f0 fs/namei.c:3688 > > > #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock_shared include/linux/fs.h:766 [inline] > > > #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: open_last_lookups fs/namei.c:3480 [inline] > > > #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: path_openat+0x1514/0x28f0 fs/namei.c:3688 > > > #2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: lockdep_copy_map include/linux/lockdep.h:31 [inline] > > > #2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: call_timer_fn+0xd5/0x6b0 kernel/time/timer.c:1464 > > > #3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: buf_msg net/tipc/msg.h:202 [inline] > > > #3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: tipc_bearer_xmit_skb+0x8c/0x410 net/tipc/bearer.c:550 > > > > > > stack backtrace: > > > CPU: 1 PID: 27685 Comm: syz-executor.2 Not tainted 6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0 > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 > > > Call Trace: > > > <IRQ> > > > __dump_stack lib/dump_stack.c:88 [inline] > > > dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 > > > print_usage_bug kernel/locking/lockdep.c:3961 [inline] > > > valid_state kernel/locking/lockdep.c:3973 [inline] > > > mark_lock_irq kernel/locking/lockdep.c:4176 [inline] > > > mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632 > > > mark_lock kernel/locking/lockdep.c:4596 [inline] > > > mark_usage kernel/locking/lockdep.c:4527 [inline] > > > __lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007 > > > lock_acquire kernel/locking/lockdep.c:5666 [inline] > > > lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631 > > > __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] > > > _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154 > > > spin_lock include/linux/spinlock.h:349 [inline] > > > padata_do_serial+0x21e/0x4b0 kernel/padata.c:392 > > > pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89 > > > padata_do_parallel+0x87b/0xa10 kernel/padata.c:217 > > > pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117 > > > crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94 > > > tipc_aead_encrypt net/tipc/crypto.c:821 [inline] > > > tipc_crypto_xmit+0xf7a/0x2af0 net/tipc/crypto.c:1756 > > > tipc_bearer_xmit_skb+0x1ed/0x410 net/tipc/bearer.c:557 > > > tipc_disc_timeout+0x75e/0xcb0 net/tipc/discover.c:335 > > > call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474 > > > expire_timers kernel/time/timer.c:1519 [inline] > > > __run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790 > > > __run_timers kernel/time/timer.c:1768 [inline] > > > run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803 > > > __do_softirq+0x1d3/0x9c6 kernel/softirq.c:571 > > > invoke_softirq kernel/softirq.c:445 [inline] > > > __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650 > > > irq_exit_rcu+0x5/0x20 kernel/softirq.c:662 > > > sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1106 > > > </IRQ> > > > > The changelog doesn't explain the problem or why the proposed solution > > fixes it. > > > > If I can read these splats right, it seems lockdep is complaining about > > how a task can take the reorder lock when softirqs are enabled > > (SOFTIRQ-ON-W) as in the tls_push_record() stack, but also when it's in > > softirq context (IN-SOFTIRQ-W), as in the tipc_disc_timeout() stack. So > > it should be enough to disable softirq here. > Yes, I agree with what you said earlier, But the softirq is already on before > the tipc_bearer_xmit_skb(), that is, (SOFTIRQ-ON-W) and (IN-SOFTIRQ-W) will > be included in the call trace of TIPC. I hope what I said above about hard vs soft irq clears this up. > > > > > Signed-off-by: Edward Adam Davis <eadavis@xxxxxxxx> > > > Reported-by: syzbot+bc05445bc14148d51915@xxxxxxxxxxxxxxxxxxxxxxxxx > > > --- > > > kernel/padata.c | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/padata.c b/kernel/padata.c > > > index e5819bb8bd1d..38c7b17da796 100644 > > > --- a/kernel/padata.c > > > +++ b/kernel/padata.c > > > @@ -388,14 +388,15 @@ void padata_do_serial(struct padata_priv *padata) > > > int hashed_cpu = padata_cpu_hash(pd, padata->seq_nr); > > > struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu); > > > struct padata_priv *cur; > > > + unsigned long flags; > > > > > > - spin_lock(&reorder->lock); > > > + spin_lock_irqsave(&reorder->lock, flags); > > > /* Sort in ascending order of sequence number. */ > > > list_for_each_entry_reverse(cur, &reorder->list, list) > > > if (cur->seq_nr < padata->seq_nr) > > > break; > > > list_add(&padata->list, &cur->list); > > > - spin_unlock(&reorder->lock); > > > + spin_unlock_irqrestore(&reorder->lock, flags); > > > > > > /* > > > * Ensure the addition to the reorder list is ordered correctly > > > -- > > > 2.37.2 > > >