On Thu, Sep 26, 2024 at 11:19:53AM -0700, Ben Greear wrote: > On 9/26/24 02:03, Willem de Bruijn wrote: > > greearb@ wrote: > > > From: Ben Greear <greearb@xxxxxxxxxxxxxxx> > > > > > > This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853. > > > > > > dev_queue_xmit_nit needs to run with bh locking, otherwise > > > it conflicts with packets coming in from a nic in softirq > > > context and packets being transmitted from user context. > > > > > > ================================ > > > WARNING: inconsistent lock state > > > 6.11.0 #1 Tainted: G W > > > -------------------------------- > > > inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. > > > btserver/134819 [HC0[0]:SC0[0]:HE1:SE1] takes: > > > ffff8882da30c118 (rlock-AF_PACKET){+.?.}-{2:2}, at: tpacket_rcv+0x863/0x3b30 > > > {IN-SOFTIRQ-W} state was registered at: > > > lock_acquire+0x19a/0x4f0 > > > _raw_spin_lock+0x27/0x40 > > > packet_rcv+0xa33/0x1320 > > > __netif_receive_skb_core.constprop.0+0xcb0/0x3a90 > > > __netif_receive_skb_list_core+0x2c9/0x890 > > > netif_receive_skb_list_internal+0x610/0xcc0 > > > napi_complete_done+0x1c0/0x7c0 > > > igb_poll+0x1dbb/0x57e0 [igb] > > > __napi_poll.constprop.0+0x99/0x430 > > > net_rx_action+0x8e7/0xe10 > > > handle_softirqs+0x1b7/0x800 > > > __irq_exit_rcu+0x91/0xc0 > > > irq_exit_rcu+0x5/0x10 > > > [snip] > > > > > > other info that might help us debug this: > > > Possible unsafe locking scenario: > > > > > > CPU0 > > > ---- > > > lock(rlock-AF_PACKET); > > > <Interrupt> > > > lock(rlock-AF_PACKET); > > > > > > *** DEADLOCK *** > > > > > > Call Trace: > > > <TASK> > > > dump_stack_lvl+0x73/0xa0 > > > mark_lock+0x102e/0x16b0 > > > __lock_acquire+0x9ae/0x6170 > > > lock_acquire+0x19a/0x4f0 > > > _raw_spin_lock+0x27/0x40 > > > tpacket_rcv+0x863/0x3b30 > > > dev_queue_xmit_nit+0x709/0xa40 > > > vrf_finish_direct+0x26e/0x340 [vrf] > > > vrf_l3_out+0x5f4/0xe80 [vrf] > > > __ip_local_out+0x51e/0x7a0 > > > [snip] > > > > > > Fixes: 504fc6f4f7f6 ("vrf: Remove unnecessary RCU-bh critical section") > > > Link: https://lore.kernel.org/netdev/05765015-f727-2f30-58da-2ad6fa7ea99f@xxxxxxxxxxxxxxx/T/ > > > > > > Signed-off-by: Ben Greear <greearb@xxxxxxxxxxxxxxx> > > > > Please Cc: all previous reviewers and folks who participated in the > > discussion. I entirely missed this. No need to add as Cc tags, just > > --cc in git send-email will do. > > > > > --- > > > > > > v2: Edit patch description. > > > > > > drivers/net/vrf.c | 2 ++ > > > net/core/dev.c | 1 + > > > 2 files changed, 3 insertions(+) > > > > > > diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c > > > index 4d8ccaf9a2b4..4087f72f0d2b 100644 > > > --- a/drivers/net/vrf.c > > > +++ b/drivers/net/vrf.c > > > @@ -608,7 +608,9 @@ static void vrf_finish_direct(struct sk_buff *skb) > > > eth_zero_addr(eth->h_dest); > > > eth->h_proto = skb->protocol; > > > + rcu_read_lock_bh(); > > > dev_queue_xmit_nit(skb, vrf_dev); > > > + rcu_read_unlock_bh(); > > > skb_pull(skb, ETH_HLEN); > > > } > > > diff --git a/net/core/dev.c b/net/core/dev.c > > > index cd479f5f22f6..566e69a38eed 100644 > > > --- a/net/core/dev.c > > > +++ b/net/core/dev.c > > > @@ -2285,6 +2285,7 @@ EXPORT_SYMBOL_GPL(dev_nit_active); > > > /* > > > * Support routine. Sends outgoing frames to any network > > > * taps currently in use. > > > + * BH must be disabled before calling this. > > > > Unnecessary. Especially patches for stable should be minimal to > > reduce the chance of conflicts. > > I was asked to add this, and also asked to CC stable. <formletter> This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly. </formletter>