On Tue, Feb 4, 2025 at 10:30 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote: > > On Tue, 4 Feb 2025 13:17:08 -0800 Paul E. McKenney wrote: > > > > TBH I'm slightly confused by this, and the previous warnings. > > > > > > > > The previous one was from a timer callback. > > > > > > > > This one is with BH disabled. > > > > > > > > I thought BH implies RCU protection. We certainly depend on that > > > > in NAPI for XDP. And threaded NAPI does the exact same thing as > > > > xfrm_trans_reinject(), a bare local_bh_disable(). > > > > > > > > RCU folks, did something change or is just holes in my brain again? > > > > > > Nope, BH does not imply rcu_read_lock() > > > > You are both right? ;-) > > > > The synchronize_rcu() function will wait for all types of RCU readers, > > including BH-disabled regions of code. However, lockdep can distinguish > > between the various sorts of readers. So for example > > > > lockdep_assert_in_rcu_read_lock_bh(); > > > > will complain unless you did rcu_read_lock_bh(), even if you did something > > like disable_bh(). If you don't want to distinguish and are happy with > > any type of RCU reader, you can use > > > > lockdep_assert_in_rcu_reader(); > > > > I have been expecting that CONFIG_PREEMPT_RT=y kernels will break this > > any day now, but so far so good. ;-) > > Thanks Paul! So IIUC in this case we could: > > diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h > index 0f5eb9db0c62..58ec1eb9ae6a 100644 > --- a/include/net/net_namespace.h > +++ b/include/net/net_namespace.h > @@ -401,7 +401,7 @@ static inline struct net *read_pnet(const possible_net_t *pnet) > static inline struct net *read_pnet_rcu(possible_net_t *pnet) > { > #ifdef CONFIG_NET_NS > - return rcu_dereference(pnet->net); > + return rcu_dereference_check(pnet->net, rcu_read_lock_bh_held()); > #else > return &init_net; > #endif > > Sorry for the sideline, Eric, up to you how to proceed.. I will squash this diff to the following iteration, and keep rcu_dereference() Note that nf_hook() also grabs rcu_read_lock(). diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c index 4030527ebe098e86764f37c9068d2f2f9af2d183..ee5a69fdc67a30e55c5a073455e1d7299f168f34 100644 --- a/net/ipv6/ip6_input.c +++ b/net/ipv6/ip6_input.c @@ -477,9 +477,7 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr, static int ip6_input_finish(struct net *net, struct sock *sk, struct sk_buff *skb) { skb_clear_delivery_time(skb); - rcu_read_lock(); ip6_protocol_deliver_rcu(net, skb, 0, false); - rcu_read_unlock(); return 0; } @@ -487,9 +485,14 @@ static int ip6_input_finish(struct net *net, struct sock *sk, struct sk_buff *sk int ip6_input(struct sk_buff *skb) { - return NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_IN, - dev_net_rcu(skb->dev), NULL, skb, skb->dev, NULL, - ip6_input_finish); + int res; + + rcu_read_lock(); + res = NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_IN, + dev_net_rcu(skb->dev), NULL, skb, skb->dev, NULL, + ip6_input_finish); + rcu_read_unlock(); + return res; } EXPORT_SYMBOL_GPL(ip6_input);