On 1/14/21 3:36 PM, Jesper Dangaard Brouer wrote:
[...]
+BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb,
+ u32, ifindex, u32 *, mtu_len, s32, len_diff, u64, flags)
+{
+ int ret = BPF_MTU_CHK_RET_FRAG_NEEDED;
+ struct net_device *dev = skb->dev;
+ int skb_len, dev_len;
+ int mtu;
+
+ if (unlikely(flags & ~(BPF_MTU_CHK_SEGS)))
+ return -EINVAL;
+
+ dev = __dev_via_ifindex(dev, ifindex);
+ if (unlikely(!dev))
+ return -ENODEV;
+
+ mtu = READ_ONCE(dev->mtu);
+
+ dev_len = mtu + dev->hard_header_len;
+ skb_len = skb->len + len_diff; /* minus result pass check */
+ if (skb_len <= dev_len) {
+ ret = BPF_MTU_CHK_RET_SUCCESS;
+ goto out;
+ }
+ /* At this point, skb->len exceed MTU, but as it include length of all
+ * segments, it can still be below MTU. The SKB can possibly get
+ * re-segmented in transmit path (see validate_xmit_skb). Thus, user
+ * must choose if segs are to be MTU checked. Last SKB "headlen" is
+ * checked against MTU.
+ */
+ if (skb_is_gso(skb)) {
+ ret = BPF_MTU_CHK_RET_SUCCESS;
+
+ if (!(flags & BPF_MTU_CHK_SEGS))
+ goto out;
+
+ if (!skb_gso_validate_network_len(skb, mtu)) {
+ ret = BPF_MTU_CHK_RET_SEGS_TOOBIG;
+ goto out;
+ }
+
+ skb_len = skb_headlen(skb) + len_diff;
+ if (skb_len > dev_len) {
[...]
Do you have a particular use case for the BPF_MTU_CHK_SEGS?
The complaint from Maze (and others) were that when skb_is_gso then all
the MTU checks are bypassed. This flag enables checking the GSO part
via skb_gso_validate_network_len(). We cannot enable it per default,
as you say, it is universally correct in all cases.
If there is a desire to have access to the skb_gso_validate_network_len(), I'd
keep that behind the flag then, but would drop the skb_headlen(skb) + len_diff
case given the mentioned case on rx where it would yield misleading results to
users that might be unintuitive & hard to debug.
I also don't see the flag being used anywhere in your selftests, so I presume
not as otherwise you would have added an example there?
I'm using the flag in the bpf-examples code[1], this is how I've tested
the code path.
I've not found a way to generate GSO packet via the selftests
infrastructure via bpf_prog_test_run_xattr(). I'm
[1] https://github.com/xdp-project/bpf-examples/blob/master/MTU-tests/tc_mtu_enforce.c
Haven't checked but likely something as prog_tests/skb_ctx.c might not be sufficient
to pass it into the helper. For real case you might need a netns + veth setup like
some of the other tests are doing and then generating TCP stream from one end to the
other.