On Tue, 6 Oct 2020 18:34:50 -0700 Maciej Żenczykowski <maze@xxxxxxxxxx> wrote: > On Tue, Oct 6, 2020 at 9:03 AM Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote: > > > > The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup) > > can perform MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog > > don't know the MTU value that caused this rejection. > > > > If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it > > need to know this MTU value for the ICMP packet. > > > > Patch change lookup and result struct bpf_fib_lookup, to contain this MTU > > value as output via a union with 'tot_len' as this is the value used for > > the MTU lookup. > > > > Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx> > > --- > > include/uapi/linux/bpf.h | 11 +++++++++-- > > net/core/filter.c | 17 ++++++++++++----- > > 2 files changed, 21 insertions(+), 7 deletions(-) > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > index c446394135be..50ce65e37b16 100644 > > --- a/include/uapi/linux/bpf.h > > +++ b/include/uapi/linux/bpf.h [...] > > @@ -4844,9 +4847,13 @@ struct bpf_fib_lookup { > > __be16 sport; > > __be16 dport; > > > > - /* total length of packet from network header - used for MTU check */ > > - __u16 tot_len; > > + union { /* used for MTU check */ > > + /* input to lookup */ > > + __u16 tot_len; /* total length of packet from network hdr */ > > > > + /* output: MTU value (if requested check_mtu) */ > > + __u16 mtu; > > + }; > > /* input: L3 device index for lookup > > * output: device index from FIB lookup > > */ [...] > > It would be beneficial to be able to fetch the route advmss, initcwnd, > etc as well... > But I take it the struct can't be extended? The struct bpf_fib_lookup is exactly 1 cache-line (64 bytes) for performance reasons. I do believe that it can be extended, as Ahern designed the BPF-helper API cleverly via a plen (detail below signature). For accessing other route metric information like advmss and initcwnd, I would expect Daniel to suggest to use BTF to access info from dst_entry, or actually dst->_metrics. But looking at the details for accessing dst->_metrics is complicated by macros, thus I expect BTF would have a hard time. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer int bpf_fib_lookup(void *ctx, struct bpf_fib_lookup *params, int plen, u32 flags) $ pahole -C bpf_fib_lookup struct bpf_fib_lookup { __u8 family; /* 0 1 */ __u8 l4_protocol; /* 1 1 */ __be16 sport; /* 2 2 */ __be16 dport; /* 4 2 */ union { __u16 tot_len; /* 6 2 */ __u16 mtu; /* 6 2 */ }; /* 6 2 */ __u32 ifindex; /* 8 4 */ union { __u8 tos; /* 12 1 */ __be32 flowinfo; /* 12 4 */ __u32 rt_metric; /* 12 4 */ }; /* 12 4 */ union { __be32 ipv4_src; /* 16 4 */ __u32 ipv6_src[4]; /* 16 16 */ }; /* 16 16 */ union { __be32 ipv4_dst; /* 32 4 */ __u32 ipv6_dst[4]; /* 32 16 */ }; /* 32 16 */ __be16 h_vlan_proto; /* 48 2 */ __be16 h_vlan_TCI; /* 50 2 */ __u8 smac[6]; /* 52 6 */ __u8 dmac[6]; /* 58 6 */ /* size: 64, cachelines: 1, members: 13 */ }; struct dst_metrics { u32 metrics[RTAX_MAX]; refcount_t refcnt; } __aligned(4); /* Low pointer bits contain DST_METRICS_FLAGS */ extern const struct dst_metrics dst_default_metrics; #define DST_METRICS_READ_ONLY 0x1UL #define DST_METRICS_REFCOUNTED 0x2UL #define DST_METRICS_FLAGS 0x3UL #define __DST_METRICS_PTR(Y) \ ((u32 *)((Y) & ~DST_METRICS_FLAGS)) #define DST_METRICS_PTR(X) __DST_METRICS_PTR((X)->_metrics) static inline u32 dst_metric_raw(const struct dst_entry *dst, const int metric) { u32 *p = DST_METRICS_PTR(dst); return p[metric-1]; } static inline u32 dst_metric_advmss(const struct dst_entry *dst) { u32 advmss = dst_metric_raw(dst, RTAX_ADVMSS); if (!advmss) advmss = dst->ops->default_advmss(dst); return advmss; }