On Wed, Jun 05, 2024 at 09:08:33PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > [ CC Willem ]
> >
> > On Wed, Jun 05, 2024 at 08:14:50PM +0200, Florian Westphal wrote:
> > > Christoph Paasch <cpaasch@xxxxxxxxx> wrote:
> > > > > Reported-by: Christoph Paasch <cpaasch@xxxxxxxxx>
> > > > > Suggested-by: Paolo Abeni <pabeni@xxxxxxxxxx>
> > > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/494
> > > > > Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> > > >
> > > > I just gave this one a shot in my syzkaller instances and am still hitting the issue.
> > >
> > > No, different bug, this patch is correct.
> > >
> > > I refuse to touch the flow dissector.
> >
> > I see callers of ip_local_out() in the tree which do not set skb->dev.
> >
> > I don't understand this:
> >
> > bool __skb_flow_dissect(const struct net *net,
> >                         const struct sk_buff *skb,
> >                         struct flow_dissector *flow_dissector,
> >                         void *target_container, const void *data,
> >                         __be16 proto, int nhoff, int hlen, unsigned int flags)
> > {
> > [...]
> >         WARN_ON_ONCE(!net);
> >         if (net) {
> >
> > it was added by 9b52e3f267a6 ("flow_dissector: handle no-skb use case")
> >
> > Is this WARN_ON_ONCE() bogus?
>
> When this was added (handle dissection from bpf prog, per netns), the correct
> solution would have been to pass 'struct net' explicitly via skb_get_hash()
> and all variants.  As that was likely deemed to be too much code churn it
> tries to infer struct net via skb->{dev,sk}.
>
> So there are several options here:
> 1. remove the WARN_ON_ONCE and be done with it
> 2. remove the WARN_ON_ONCE and pretend net was init_net
> 3. also look at skb_dst(skb)->dev if skb->dev is unset, then back to 1)
>    or 2)
> 4. stop using skb_get_hash() from netfilter (but there are likely other
>    callers that might hit this).
> 5. fix up callers, one by one
> 6. assign skb->dev inside netfilter if its unset
>
> 3 and 2 combined are probably going to be the least invasive.
>
> 5 might take some time, we now know two, namely tcp resets generated
> from netfilter and igmp_send_report().  No idea if there are more.

Quickly browsing, synproxy and tee also call ip_local_out().

The same applies to icmp_send(), which is used, e.g., to send
destination unreachable, in addition to the reset path.

There is also __skb_get_hash_symmetric(), which could hit this from
nft_hash?

No idea which other callers need to be adjusted to remove this splat;
that was only a cursory tree review.

Also, ip_output() sets skb->dev on the postrouting path, so if the
callers are fixed, skb->dev would be set once by the caller and then
again from the output path?
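
For illustration only, options 3 and 2 combined might look roughly like
the userspace sketch below. The struct definitions are simplified
stand-ins for the kernel types, the `dst` member stands in for what
skb_dst(skb) would return, and skb_infer_net() is a hypothetical name;
in the kernel this lookup sits inside __skb_flow_dissect() itself, so
treat this as a model of the fallback order, not as a patch.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields used here. */
struct net { int id; };
struct net_device { struct net *nd_net; };
struct sock { struct net *sk_net; };
struct dst_entry { struct net_device *dev; };

struct sk_buff {
	struct net_device *dev;
	struct sock *sk;
	struct dst_entry *dst;	/* models the result of skb_dst(skb) */
};

static struct net init_net;	/* placeholder for the initial netns */

/*
 * Hypothetical helper: choose a netns for flow dissection.
 * Order: skb->dev, then skb->sk, then the route's device (option 3),
 * and finally init_net instead of warning (option 2).
 */
static const struct net *skb_infer_net(const struct sk_buff *skb)
{
	if (!skb)
		return &init_net;
	if (skb->dev)
		return skb->dev->nd_net;
	if (skb->sk)
		return skb->sk->sk_net;
	if (skb->dst && skb->dst->dev)
		return skb->dst->dev->nd_net;
	return &init_net;
}
```

With this ordering, a locally generated packet that has a route attached
but no skb->dev yet (the tcp reset and igmp_send_report() cases) would
resolve through the dst device, and only a packet with none of the three
set would fall back to init_net.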