On Sat, 10 Oct 2020 09:32:12 -0700
Jakub Kicinski <kuba@xxxxxxxxxx> wrote:

> On Sat, 10 Oct 2020 12:44:02 +0200 Jesper Dangaard Brouer wrote:
> > > > > We will not be sprinkling validation checks across the drivers because
> > > > > some reconfiguration path may occasionally yield a bad packet, or it's
> > > > > hard to do something right with BPF.
> > > >
> > > > This is a driver bug then. As it stands today drivers may get hit with
> > > > skb with MTU greater than set MTU as best I can tell.
> > >
> > > You're talking about taking it from "maybe this can happen, but will
> > > still be at most jumbo" to "it's going to be very easy to trigger and
> > > length may be > MAX_U16".
> >
> > It is interesting that a misbehaving BPF program can easily trigger this.
> > Next week, I will look into writing such a BPF-prog and then test it on
> > the hardware I have available in my testlab.

I've tested sending different packet sizes that exceed the MTU on
different hardware. They all silently drop the transmitted packet.
mlx5 and i40e, configured to (L3) MTU 1500, will let through up to 1504,
while ixgbe will drop size 1504. Packets can be observed locally with
tcpdump, but the other end doesn't receive the packet. I didn't find
any counters (including ethtool -S) indicating these packets were
dropped at hardware/firmware level, which is a little concerning for
later troubleshooting.

Another observation is that size increases (with bpf_skb_adjust_room)
above 4096 + e.g. 128 will likely fail, even though I have the 64K limit
in this kernel.

> FWIW I took a quick swing at testing it with the HW I have and it did
> exactly what hardware should do. The TX unit entered an error state
> and then the driver detected that and reset it a few seconds later.

The drivers (i40e, mlx5, ixgbe) I tested with didn't enter an error
state when getting packets exceeding the MTU. I didn't go much above
4K, so maybe I didn't trigger those cases.

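For anyone repeating this kind of test, the generic netdev counters
under /sys/class/net/<dev>/statistics/ (e.g. tx_dropped, tx_errors) can
be snapshotted before and after sending the oversized packets, next to
the driver-specific ethtool -S output. The following is only a minimal
sketch under assumptions: read_netdev_counter() and counter_delta() are
hypothetical helper names, and whether these counters move at all for
oversized TX frames is driver-dependent and not something the tests
above confirm.

```c
#include <stdio.h>

/*
 * Read a single counter file such as
 *   /sys/class/net/<dev>/statistics/tx_dropped
 * stats_dir is the statistics directory, counter is the file name.
 * Returns 0 and stores the value in *val on success, -1 on any error.
 * (Hypothetical helper, not part of any existing tool.)
 */
int read_netdev_counter(const char *stats_dir, const char *counter,
                        unsigned long long *val)
{
    char path[512];
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "%s/%s", stats_dir, counter);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* path did not fit in the buffer */

    f = fopen(path, "r");
    if (!f)
        return -1;              /* no such device or counter */

    n = fscanf(f, "%llu", val);
    fclose(f);
    return n == 1 ? 0 : -1;
}

/*
 * Difference between two snapshots of the same counter.
 * A decrease (e.g. counters cleared by a device reset) is reported as 0.
 */
unsigned long long counter_delta(unsigned long long before,
                                 unsigned long long after)
{
    return after >= before ? after - before : 0;
}
```

Usage would be along the lines of calling
read_netdev_counter("/sys/class/net/eth0/statistics", "tx_dropped", &v)
before and after the test run (eth0 being a placeholder interface name)
and comparing the two values with counter_delta(). A delta of zero while
the peer never sees the packet matches the silent drop described above.
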
> Hardware is almost always designed to behave like that. If some NIC
> actually cleanly drops over sized TX frames, I'd bet it's done in FW,
> or some other software piece.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer