greearb@ wrote: > From: Ben Greear <greearb@xxxxxxxxxxxxxxx> > > This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853. > > dev_queue_xmit_nit needs to run with bh locking, otherwise > it conflicts with packets coming in from a nic in softirq > context and packets being transmitted from user context. > > ================================ > WARNING: inconsistent lock state > 6.11.0 #1 Tainted: G W > -------------------------------- > inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. > btserver/134819 [HC0[0]:SC0[0]:HE1:SE1] takes: > ffff8882da30c118 (rlock-AF_PACKET){+.?.}-{2:2}, at: tpacket_rcv+0x863/0x3b30 > {IN-SOFTIRQ-W} state was registered at: > lock_acquire+0x19a/0x4f0 > _raw_spin_lock+0x27/0x40 > packet_rcv+0xa33/0x1320 > __netif_receive_skb_core.constprop.0+0xcb0/0x3a90 > __netif_receive_skb_list_core+0x2c9/0x890 > netif_receive_skb_list_internal+0x610/0xcc0 > napi_complete_done+0x1c0/0x7c0 > igb_poll+0x1dbb/0x57e0 [igb] > __napi_poll.constprop.0+0x99/0x430 > net_rx_action+0x8e7/0xe10 > handle_softirqs+0x1b7/0x800 > __irq_exit_rcu+0x91/0xc0 > irq_exit_rcu+0x5/0x10 > [snip] > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > ---- > lock(rlock-AF_PACKET); > <Interrupt> > lock(rlock-AF_PACKET); > > *** DEADLOCK *** > > Call Trace: > <TASK> > dump_stack_lvl+0x73/0xa0 > mark_lock+0x102e/0x16b0 > __lock_acquire+0x9ae/0x6170 > lock_acquire+0x19a/0x4f0 > _raw_spin_lock+0x27/0x40 > tpacket_rcv+0x863/0x3b30 > dev_queue_xmit_nit+0x709/0xa40 > vrf_finish_direct+0x26e/0x340 [vrf] > vrf_l3_out+0x5f4/0xe80 [vrf] > __ip_local_out+0x51e/0x7a0 > [snip] > > Fixes: 504fc6f4f7f6 ("vrf: Remove unnecessary RCU-bh critical section") > Link: https://lore.kernel.org/netdev/05765015-f727-2f30-58da-2ad6fa7ea99f@xxxxxxxxxxxxxxx/T/ > > Signed-off-by: Ben Greear <greearb@xxxxxxxxxxxxxxx> Please Cc: all previous reviewers and folks who participated in the discussion. I entirely missed this. No need to add as Cc tags, just --cc in git send-email will do. > --- > > v2: Edit patch description. > > drivers/net/vrf.c | 2 ++ > net/core/dev.c | 1 + > 2 files changed, 3 insertions(+) > > diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c > index 4d8ccaf9a2b4..4087f72f0d2b 100644 > --- a/drivers/net/vrf.c > +++ b/drivers/net/vrf.c > @@ -608,7 +608,9 @@ static void vrf_finish_direct(struct sk_buff *skb) > eth_zero_addr(eth->h_dest); > eth->h_proto = skb->protocol; > > + rcu_read_lock_bh(); > dev_queue_xmit_nit(skb, vrf_dev); > + rcu_read_unlock_bh(); > > skb_pull(skb, ETH_HLEN); > } > diff --git a/net/core/dev.c b/net/core/dev.c > index cd479f5f22f6..566e69a38eed 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -2285,6 +2285,7 @@ EXPORT_SYMBOL_GPL(dev_nit_active); > /* > * Support routine. Sends outgoing frames to any network > * taps currently in use. > + * BH must be disabled before calling this. Unnecessary. Especially patches for stable should be minimal to reduce the chance of conflicts.