RE: [PATCH bpf-next V5 3/5] bpf: add BPF-helper for MTU checking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jesper Dangaard Brouer wrote:
> This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs.
> 
> The API is designed to help the BPF-programmer, that want to do packet
> context size changes, which involves other helpers. These other helpers
> usually does a delta size adjustment. This helper also support a delta
> size (len_diff), which allow BPF-programmer to reuse arguments needed by
> these other helpers, and perform the MTU check prior to doing any actual
> size adjustment of the packet context.
> 
> It is on purpose, that we allow the len adjustment to become a negative
> result, that will pass the MTU check. This might seem weird, but it's not
> this helpers responsibility to "catch" wrong len_diff adjustments. Other
> helpers will take care of these checks, if BPF-programmer chooses to do
> actual size adjustment.
> 
> V4: Lot of changes
>  - ifindex 0 now use current netdev for MTU lookup
>  - rename helper from bpf_mtu_check to bpf_check_mtu
>  - fix bug for GSO pkt length (as skb->len is total len)
>  - remove __bpf_len_adj_positive, simply allow negative len adj
> 
> V3: Take L2/ETH_HLEN header size into account and document it.
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
> ---

Sorry for the late feedback here.

This seems like a lot of baked in functionality into the helper? Can you
say something about why the simpler and, at least to me, more intuitive
helper to simply return the ifindex mtu is not ideal?

Rough pseudo code being,

 my_sender(struct __sk_buff *skb, int fwd_ifindex)
 {
   mtu = bpf_get_ifindex_mtu(fwd_ifindex, 0);
   if (skb->len + HDR_SIZE < mtu)
       return send_with_hdrs(skb);
   return -EMSGSIZE
 }


>  include/uapi/linux/bpf.h       |   70 +++++++++++++++++++++++
>  net/core/filter.c              |  120 ++++++++++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h |   70 +++++++++++++++++++++++
>  3 files changed, 260 insertions(+)
> 

[...]

> + *              **BPF_MTU_CHK_RELAX**
> + *			This flag relax or increase the MTU with room for one
> + *			VLAN header (4 bytes). This relaxation is also used by
> + *			the kernels own forwarding MTU checks.

I noted below as well, but not sure why this is needed. Seems if user
knows to add a flag because they want a vlan header we can just as
easily expect BPF program to do it. Also it only works for VLAN headers
any other header data wont be accounted for so it seems only useful
in one specific case.

> + *
> + *		**BPF_MTU_CHK_SEGS**
> + *			This flag will only works for *ctx* **struct sk_buff**.
> + *			If packet context contains extra packet segment buffers
> + *			(often knows as GSO skb), then MTU check is partly
> + *			skipped, because in transmit path it is possible for the
> + *			skb packet to get re-segmented (depending on net device
> + *			features).  This could still be a MTU violation, so this
> + *			flag enables performing MTU check against segments, with
> + *			a different violation return code to tell it apart.
> + *
> + *		The *mtu_result* pointer contains the MTU value of the net
> + *		device including the L2 header size (usually 14 bytes Ethernet
> + *		header). The net device configured MTU is the L3 size, but as
> + *		XDP and TX length operate at L2 this helper include L2 header
> + *		size in reported MTU.
> + *
> + *	Return
> + *		* 0 on success, and populate MTU value in *mtu_result* pointer.
> + *
> + *		* < 0 if any input argument is invalid (*mtu_result* not updated)
> + *
> + *		MTU violations return positive values, but also populate MTU
> + *		value in *mtu_result* pointer, as this can be needed for
> + *		implementing PMTU handing:
> + *
> + *		* **BPF_MTU_CHK_RET_FRAG_NEEDED**
> + *		* **BPF_MTU_CHK_RET_SEGS_TOOBIG**
> + *
>   */

[...]

> +static int __bpf_lookup_mtu(struct net_device *dev_curr, u32 ifindex, u64 flags)
> +{
> +	struct net *netns = dev_net(dev_curr);
> +	struct net_device *dev;
> +	int mtu;
> +
> +	/* Non-redirect use-cases can use ifindex=0 and save ifindex lookup */
> +	if (ifindex == 0)
> +		dev = dev_curr;
> +	else
> +		dev = dev_get_by_index_rcu(netns, ifindex);
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	/* XDP+TC len is L2: Add L2-header as dev MTU is L3 size */
> +	mtu = dev->mtu + dev->hard_header_len;

READ_ONCE() on dev->mtu and hard_header_len as well? We don't have
any locks.

> +
> +	/*  Same relax as xdp_ok_fwd_dev() and is_skb_forwardable() */
> +	if (flags & BPF_MTU_CHK_RELAX)
> +		mtu += VLAN_HLEN;

I'm trying to think about the use case where this might be used?
Compared to just adjusting MTU in BPF program side as needed for
packet encapsulation/headers/etc.

> +
> +	return mtu;
> +}
> +



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux