Hi Willem, On Wed, Jun 05, 2024 at 09:54:59PM -0400, Willem de Bruijn wrote: > On Wed, Jun 5, 2024 at 6:16 PM Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > > On Wed, Jun 05, 2024 at 05:38:00PM -0400, Willem de Bruijn wrote: > > > On Wed, Jun 5, 2024 at 3:45 PM Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > > > > > > On Wed, Jun 05, 2024 at 09:08:33PM +0200, Florian Westphal wrote: > > > > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > > > > > > > > [ CC Willem ] > > > > > > > > > > > On Wed, Jun 05, 2024 at 08:14:50PM +0200, Florian Westphal wrote: > > > > > > > Christoph Paasch <cpaasch@xxxxxxxxx> wrote: > > > > > > > > > Reported-by: Christoph Paasch <cpaasch@xxxxxxxxx> > > > > > > > > > Suggested-by: Paolo Abeni <pabeni@xxxxxxxxxx> > > > > > > > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/494 > > > > > > > > > Signed-off-by: Florian Westphal <fw@xxxxxxxxx> > > > > > > > > > > > > > > > > I just gave this one a shot in my syzkaller instances and am still hitting the issue. > > > > > > > > > > > > > > No, different bug, this patch is correct. > > > > > > > > > > > > > > I refuse to touch the flow dissector. > > > > > > > > > > > > I see callers of ip_local_out() in the tree which do not set skb->dev. > > > > > > > > > > > > I don't understand this: > > > > > > > > > > > > bool __skb_flow_dissect(const struct net *net, > > > > > > const struct sk_buff *skb, > > > > > > struct flow_dissector *flow_dissector, > > > > > > void *target_container, const void *data, > > > > > > __be16 proto, int nhoff, int hlen, unsigned int flags) > > > > > > { > > > > > > [...] > > > > > > WARN_ON_ONCE(!net); > > > > > > if (net) { > > > > > > > > > > > > it was added by 9b52e3f267a6 ("flow_dissector: handle no-skb use case") > > > > > > > > > > > > Is this WARN_ON_ONCE() bogus? > > > > > > > > > > When this was added (handle dissection from bpf prog, per netns), the correct > > > > > solution would have been to pass 'struct net' explicitly via skb_get_hash() > > > > > and all variants. As that was likely deemed to be too much code churn it > > > > > tries to infer struct net via skb->{dev,sk}. > > > > > > It has been a while, but I think we just did not anticipate skb's with > > > neither dev nor sk set. > > > > > > Digging through the layers from skb_hash to __skb_flow_dissect > > > now, it does look impractical to add such an explicit API. > > > > > > > > So there are several options here: > > > > > 1. remove the WARN_ON_ONCE and be done with it > > > > > 2. remove the WARN_ON_ONCE and pretend net was init_net > > > > > 3. also look at skb_dst(skb)->dev if skb->dev is unset, then back to 1) > > > > > or 2) > > > > > 4. stop using skb_get_hash() from netfilter (but there are likely other > > > > > callers that might hit this). > > > > > 5. fix up callers, one by one > > > > > 6. assign skb->dev inside netfilter if its unset > > > > > > Is 6 a realistic option? > > > > Postrouting path already sets on skb->dev via ip_output(), if callers > > are "fixed" then skb->dev will get set twice. > > I meant to set it just before calling skb_get_hash and unset > right after. Just using the skb to piggy back the information. > > > the netfilter tracing infrastructure only needs a hash identifier for > > this packet to be displayed from userspace when tracing rules, how is > > the running the custom bpf dissector hook useful in such case? > > The BPF flow dissector is there as much to short circuit the > hard-coded C protocol dissectors as to expand on the existing > feature set. I did not want production machines exposed to > every protocol parser, as that set kept expanding. > > Having different dissection algorithms depending on where the > packet enters the dissector is also surprising behavior? I would say: "having different dissection _operation modes_ depending on where the packets enters the dissector is the expected behaviour", so the dissector code adapts to the use-case. > If all that is needed is an opaque ID, and on egress skb->hash > is derived with skb_set_hash_from_sk from sk_txhash, then > this can even be pseudo-random from net_tx_rndhash(). I think this will not work for the netfilter use-case, such pseudo-random number would be then generated in the scope of the chain and that packet could go over several chains while being processed. Thus, displaying different IDs for the same packet. > > the most correct solution is to pass struct net explicitly to the > > dissector API instead of guessing what net this packet belongs to. > > Unfortunately from skb_get_hash to __skb_flow_dissect is four > layers of indirections, including one with three underscores already, > that cannot be easily circumvented. > > Temporarily passing it through skb (and unsetting after) seems > the simplest fix to me. it sounds like a temporary workaround. I understand your concern that it requires a much larger patch to pass net to the flow dissector infrastructure. > > Else, remove this WARN_ON_ONCE which is just a oneliner. > > According to the commit that introduced it, this was to sniff out drivers > that don't set this (via eth_get_headlen). > > The problem is that nothing else warns loudly, so you just quietly > lose the BPF dissection coverage. > > This is the first time it warned. You point out that the value of BPF > is limited to you in this case. But that's not true for all such cases. > I'd rather dissection breaks hard than that it falls back quietly for > input from an untrusted network. > > Maybe reduce it to DEBUG_NET_WARN_ON_ONCE? I am afraid syzkaller and such will still hit this for Christoph, since DEBUG_NET is usually recommended to be turned on in such case. Another possibility is that netfilter stops using the dissector code for the tracing infrastructure at the cost of replicating code. Thanks.