On Mon, 2 Nov 2020 21:10:34 +0100 Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote: > On Mon, 02 Nov 2020 10:04:44 -0800 > John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > > > > > + > > > > > + /* Same relax as xdp_ok_fwd_dev() and is_skb_forwardable() */ > > > > > + if (flags & BPF_MTU_CHK_RELAX) > > > > > + mtu += VLAN_HLEN; > > > > > > > > I'm trying to think about the use case where this might be used? > > > > Compared to just adjusting MTU in BPF program side as needed for > > > > packet encapsulation/headers/etc. > > > > > > As I wrote above, this were added because the kernels own forwarding > > > have this relaxation in it's checks (in is_skb_forwardable()). I even > > > tried to dig through the history, introduced in [1] and copy-pasted > > > in[2]. And this seems to be a workaround, that have become standard, > > > that still have practical implications. > > > > > > My practical experiments showed, that e.g. ixgbe driver with MTU=1500 > > > (L3-size) will allow and fully send packets with 1504 (L3-size). But > > > i40e will not, and drops the packet in hardware/firmware step. So, > > > what is the correct action, strict or relaxed? > > > > > > My own conclusion is that we should inverse the flag. Meaning to > > > default add this VLAN_HLEN (4 bytes) relaxation, and have a flag to do > > > more strict check, e.g. BPF_MTU_CHK_STRICT. As for historical reasons > > > we must act like kernels version of MTU check. Unless you object, I will > > > do this in V6. > > > > I'm fine with it either way as long as its documented in the helper > > description so I have a chance of remembering this discussion in 6 months. > > But, if you make it default won't this break for XDP cases? I assume the > > XDP use case doesn't include the VLAN 4-bytes. Would you need to prevent > > the flag from being used from XDP? > > XDP actually do include the VLAN_HLEN 4-bytes, see xdp_ok_fwd_dev(). I > was so certain that you John added this code, but looking through git > blame it pointed back to myself. Going 5 levels git history deep and > 3+ years, does seem like I move/reused some of Johns code containing > VLAN_HLEN in the MTU check, introduced for xdp-generic (6103aa96ec077) > which I acked. Thus, I guess I cannot push this away and have to take > blame myself ;-) > > I conclude that we default need to include this VLAN_HLEN, else the XDP > bpf_check_mtu could say deny, while it would have passed the check in > xdp_ok_fwd_dev(). As i40e will drop 1504 this at HW/FW level, I still > see a need for a BPF_MTU_CHK_STRICT flag for programs that want to > catch this. Disagreeing with myself... I want to keep the BPF_MTU_CHK_RELAX, and let MTU check use the actual MTU value (adjusted to L2 of-cause). With the argument, that because some drivers with MTU 1500 will actually drop frame with MTU 1504 bytes (+14 eth_hdr) frames, it is wrong to "approve" this MTU size in the check. A BPF program will know it is playing with VLAN headers and can choose to violate the MTU check with 4 bytes. While BPF programs using other types of encap headers will get confused that MTU check gives them 4 bytes more, which if used will get dropped on a subset of drivers. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer