On Wed, 07 Oct 2020 20:19:39 -0700 John Fastabend <john.fastabend@xxxxxxxxx> wrote: > Maciej Żenczykowski wrote: > > On Wed, Oct 7, 2020 at 3:37 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > > > > > Daniel Borkmann wrote: > > > > On 10/7/20 6:23 PM, Jesper Dangaard Brouer wrote: > > > > [...] > > > > > net/core/dev.c | 24 ++++++++++++++++++++++-- > > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > Couple high-level comments. Whats the problem with just letting the driver > > > consume the packet? I would chalk it up to a buggy BPF program that is > > > sending these packets. The drivers really shouldn't panic or do anything > > > horrible under this case because even today I don't think we can be > > > 100% certain MTU on skb matches set MTU. Imagine the case where I change > > > the MTU from 9kB->1500B there will be some skbs in-flight with the larger > > > length and some with the shorter. If the drivers panic/fault or otherwise > > > does something else horrible thats not going to be friendly in general case > > > regardless of what BPF does. And seeing this type of config is all done > > > async its tricky (not practical) to flush any skbs in-flight. > > > > > > I've spent many hours debugging these types of feature flag, mtu > > > change bugs on the driver side I'm not sure it can be resolved by > > > the stack easily. Better to just build drivers that can handle it IMO. > > > > > > Do we know if sending >MTU size skbs to drivers causes problems in real > > > cases? I haven't tried on the NICs I have here, but I expect they should > > > be fine. Fine here being system keeps running as expected. Dropping the > > > skb either on TX or RX side is expected. Even with this change though > > > its possible for the skb to slip through if I configure MTU on a live > > > system. > > > > I wholeheartedly agree with the above. > > > > Ideally the only >mtu check should happen at driver admittance. > > But again ideally it should happen in some core stack location not in > > the driver itself. > > Ideally maybe, but IMO we should just let the skb go to the driver > and let the driver sort it out. Even if this means pushing the packet > onto the wire then the switch will drop it or the receiver, etc. A > BPF program can do lots of horrible things that should never be > on the wire otherwise. MTU is just one of them, but sending corrupted > payloads, adding bogus headers, checksums etc. so I don't think we can > reasonable protect against all of them. That is a good point. > Of course if the driver is going to hang/panic then something needs > to be done. Perhaps a needs_mtu_check feature flag, although thats > not so nice either so perhaps drivers just need to handle it themselves. > Also even today the case could happen without BPF as best I can tell > so the drivers should be prepared for it. Yes, borderline cases do exist already today (like your reconf with inflight packets), so I guess drivers should at-least not hang/panic and be robust enough that we can drop this check. I think you have convinced me that we can drop this check. My only concern is how people can troubleshoot this, but its not different from current state. > > However, due to both gso and vlan offload, even this is not trivial to do... > > The mtu is L3, but drivers/hardware/the wire usually care about L2... If net_device->mtu is L3 (1500) and XDP (and TC, right?) operate at L2, that likely means that the "strict" bpf_mtu_check (in my BPF-helper) is wrong, as XDP (and TC) length at this point include the 14 bytes Ethernet header. I will check and fix. Is this accounted for via net_device->hard_header_len ? -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer