__dev_queue_xmit(), which is our caller, does run under
rcu_read_lock_bh(), and in my foolishness I had thought this would be
enough to make the access safe. Lockdep, however, complains that
rcu_read_lock() is not held.

Which it isn't - as it turns out, RCU preempt and RCU-bh are two
different flavors, and although Paul McKenney has consolidated
synchronize_rcu() to wait for both preempt and bh read-side critical
sections [1], the reader-side API is different, and the lockdep maps and
keys are different.

The bridge calls synchronize_rcu() in br_vlan_flush(), and this does wait
for our TX fastpath reader of the br_vlan_group_rcu to complete even
though it is in an rcu-bh read-side section. So even though this is in
principle safe, to lockdep this is a case of "who are you? I don't know
you, you're suspicious".

Side note, I still don't really understand the different RCU flavors.
For example, as far as I can see, the core network stack has never
directly called synchronize_rcu_bh, not even once. Just the initial
synchronize_kernel(), replaced later with the RCU preempt variant -
synchronize_rcu(). Very very long story short, dev_queue_xmit has been
calling this exact variant - rcu_read_lock_bh() - since [2], to make
dev_deactivate properly wait for network interfaces with NETIF_F_LLTX to
finish their dev_queue_xmit(). But that relied on an existing
synchronize_rcu(), not synchronize_rcu_bh(). So does this mean that
synchronize_net() never really waited for the rcu-bh critical section in
dev_queue_xmit to finish? I've no idea.

So basically there are multiple options.

The first would be to duplicate br_vlan_get_pvid_rcu() into a new
br_vlan_get_pvid_rcu_bh() to appease lockdep for the TX path case. But
this function already has another brother, br_vlan_get_pvid(), which is
protected by the update-side rtnl_mutex. We don't want to grow the family
too big, especially since br_vlan_get_pvid_rcu_bh() would not be used by
the bridge at all, just exported by it and used by the DSA layer.

The second option, getting to the bottom of why __dev_queue_xmit uses
rcu-bh and splitting that into local_bh_disable + rcu_read_lock, as it
was before [3], might be impractical. There have been 15 years of
development since then, and there are lots of code paths that use
rcu_dereference_bh() in the TX path. Plus, with the consolidation work
done in [1], I'm not even sure what the practical benefits of rcu-bh are
any longer, if the whole point was for synchronize_rcu() to wait for
everything in sight - how can spammy softirqs like networking paint
themselves red any longer, and how can certain RCU updaters not wait for
them now, in order to avoid denial of service? It doesn't appear possible
from the distance from which I'm looking at the problem. So the effort of
converting __dev_queue_xmit from rcu-bh to rcu-preempt would only appear
justified if it went together with the complete elimination of rcu-bh.
Also, it would appear to be quite a strange and roundabout way to fix a
"suspicious RCU usage" lockdep message.

Last, it appears possible to just give lockdep what it wants, and hold an
rcu-preempt read-side critical section when calling br_vlan_get_pvid_rcu
from the TX path. In terms of lines of code and amount of thought needed
it is certainly the easiest path forward, even though it incurs a small
(negligible, but avoidable) performance overhead. This is what this patch
does, for lack of a deeper understanding of lockdep, RCU or the network
transmission process.

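As a rough illustration of the lockdep complaint described above, here is
a user-space toy model, not kernel code. It assumes a heavy
simplification: each reader flavor is reduced to its own nesting counter
standing in for its lockdep map, and the real rcu_read_lock_bh_held()
conditions on softirq and IRQ state are ignored. All toy_* names are
hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two separate lockdep maps. */
static int toy_rcu_depth;	/* taken by the plain rcu_read_lock() model */
static int toy_rcu_bh_depth;	/* taken by the rcu_read_lock_bh() model */

static void toy_rcu_read_lock(void)	{ toy_rcu_depth++; }
static void toy_rcu_read_lock_bh(void)	{ toy_rcu_bh_depth++; }

static void toy_rcu_read_unlock(void)
{
	assert(toy_rcu_depth > 0);
	toy_rcu_depth--;
}

static void toy_rcu_read_unlock_bh(void)
{
	assert(toy_rcu_bh_depth > 0);
	toy_rcu_bh_depth--;
}

/* Each "held" check consults only its own map. */
static bool toy_rcu_read_lock_held(void)	{ return toy_rcu_depth > 0; }
static bool toy_rcu_read_lock_bh_held(void)	{ return toy_rcu_bh_depth > 0; }

/*
 * Stand-in for br_vlan_get_pvid_rcu(): returns true when the modelled
 * lockdep check would stay quiet, false when it would complain.
 */
static bool toy_get_pvid_rcu(void)
{
	return toy_rcu_read_lock_held();
}

/* TX path before the patch: only the rcu-bh section is held. */
bool toy_xmit_before(void)
{
	bool quiet;

	toy_rcu_read_lock_bh();
	assert(toy_rcu_read_lock_bh_held());	/* bh section is active... */
	quiet = toy_get_pvid_rcu();		/* ...but the plain map is empty */
	toy_rcu_read_unlock_bh();

	return quiet;
}

/* TX path after the patch: a plain section is nested inside rcu-bh. */
bool toy_xmit_after(void)
{
	bool quiet;

	toy_rcu_read_lock_bh();
	toy_rcu_read_lock();
	quiet = toy_get_pvid_rcu();
	toy_rcu_read_unlock();
	toy_rcu_read_unlock_bh();

	return quiet;
}
```

In this model toy_xmit_before() reports a complaint while toy_xmit_after()
does not, which mirrors the approach taken by this patch. It says nothing
about grace periods; per the discussion above, the bridge's
synchronize_rcu() already waits for the rcu-bh reader either way.
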
[1] https://lwn.net/Articles/777036/
[2] commit d4828d85d188 ("[NET]: Prevent transmission after dev_deactivate")
[3] commit 43da55cbd54e ("[NET]: Do less atomic count changes in dev_queue_xmit.")

Fixes: d82f8ab0d874 ("net: dsa: tag_dsa: offload the bridge forwarding process")
Signed-off-by: Vladimir Oltean <vladimir.oltean@xxxxxxx>
---
 net/dsa/tag_dsa.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/dsa/tag_dsa.c b/net/dsa/tag_dsa.c
index 77d0ce89ab77..178464cd2bdb 100644
--- a/net/dsa/tag_dsa.c
+++ b/net/dsa/tag_dsa.c
@@ -150,10 +150,9 @@ static struct sk_buff *dsa_xmit_ll(struct sk_buff *skb, struct net_device *dev,
 		 * that's where the packets ingressed from.
 		 */
 		if (!br_vlan_enabled(br)) {
-			/* Safe because __dev_queue_xmit() runs under
-			 * rcu_read_lock_bh()
-			 */
+			rcu_read_lock();
 			err = br_vlan_get_pvid_rcu(br, &pvid);
+			rcu_read_unlock();
 			if (err)
 				return NULL;
 		}
-- 
2.25.1