Re: [PATCH bpf-next V2 5/6] bpf: Add MTU check for TC-BPF packets after egress hook

On 10/7/20 6:23 PM, Jesper Dangaard Brouer wrote:
[...]
  net/core/dev.c |   24 ++++++++++++++++++++++--
  1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index b433098896b2..19406013f93e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3870,6 +3870,7 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
  	switch (tcf_classify(skb, miniq->filter_list, &cl_res, false)) {
  	case TC_ACT_OK:
  	case TC_ACT_RECLASSIFY:
+		*ret = NET_XMIT_SUCCESS;
  		skb->tc_index = TC_H_MIN(cl_res.classid);
  		break;
  	case TC_ACT_SHOT:
@@ -4064,9 +4065,12 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
  {
  	struct net_device *dev = skb->dev;
  	struct netdev_queue *txq;
+#ifdef CONFIG_NET_CLS_ACT
+	bool mtu_check = false;
+#endif
+	bool again = false;
  	struct Qdisc *q;
  	int rc = -ENOMEM;
-	bool again = false;
  	skb_reset_mac_header(skb);
@@ -4082,14 +4086,28 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
  	qdisc_pkt_len_init(skb);
  #ifdef CONFIG_NET_CLS_ACT
+	mtu_check = skb_is_redirected(skb);
  	skb->tc_at_ingress = 0;
  # ifdef CONFIG_NET_EGRESS
  	if (static_branch_unlikely(&egress_needed_key)) {
+		unsigned int len_orig = skb->len;
+
  		skb = sch_handle_egress(skb, &rc, dev);
  		if (!skb)
  			goto out;
+		/* BPF-prog ran and could have changed packet size beyond MTU */
+		if (rc == NET_XMIT_SUCCESS && skb->len > len_orig)
+			mtu_check = true;
  	}
  # endif
+	/* MTU-check only happens on "last" net_device in a redirect sequence
+	 * (e.g. above sch_handle_egress can steal SKB and skb_do_redirect it
+	 * either ingress or egress to another device).
+	 */

Hmm, this adds quite some overhead in the fast path. Also, won't this be checked
multiple times on stacked devices? :( Moreover, this misses the fact that 'real'
qdiscs can have filters attached too, and those would run after this check. Can't
this instead be done in the driver layer for those that really need it? I would
probably only drop the check as done in 1/6 and allow the BPF prog to do the
validation if needed.

+	if (mtu_check && !is_skb_forwardable(dev, skb)) {
+		rc = -EMSGSIZE;
+		goto drop;
+	}
  #endif
  	/* If device/qdisc don't need skb->dst, release it right now while
  	 * its hot in this cpu cache.
@@ -4157,7 +4175,9 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
  	rc = -ENETDOWN;
  	rcu_read_unlock_bh();
-
+#ifdef CONFIG_NET_CLS_ACT
+drop:
+#endif
  	atomic_long_inc(&dev->tx_dropped);
  	kfree_skb_list(skb);
  	return rc;
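
To make the control flow of the quoted hunk easier to follow, here is a rough
userspace model of when the proposed check would fire. It is only a sketch, not
kernel code: struct model_skb, struct model_dev, max_len and the model_*
functions are hypothetical names made up for illustration, and the single
max_len comparison is a simplification standing in for is_skb_forwardable().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects; not the real structs. */
struct model_skb {
	unsigned int len;	/* current packet length, like skb->len */
	bool redirected;	/* what skb_is_redirected() would report */
};

struct model_dev {
	unsigned int max_len;	/* assumed limit; simplifies is_skb_forwardable() */
};

/*
 * Mirrors the decision in the patch: the MTU check is armed when the skb
 * was redirected, or when the egress hook ran, returned success and left
 * the packet longer than it was before the hook (len_orig).
 */
static int model_egress_mtu_check(const struct model_skb *skb,
				  const struct model_dev *dev,
				  unsigned int len_orig,
				  bool egress_ran, bool egress_ok)
{
	bool mtu_check = skb->redirected;

	if (egress_ran && egress_ok && skb->len > len_orig)
		mtu_check = true;

	if (mtu_check && skb->len > dev->max_len)
		return -EMSGSIZE;	/* the patch jumps to the drop label here */

	return 0;
}

/* Small walk-through of the four interesting cases. */
static void model_selftest(void)
{
	struct model_dev dev = { .max_len = 1514 };
	struct model_skb plain = { .len = 2000, .redirected = false };
	struct model_skb redir = { .len = 2000, .redirected = true };
	struct model_skb grown = { .len = 1600, .redirected = false };
	struct model_skb small = { .len = 200, .redirected = false };

	/* Not redirected and not grown: the check is never armed. */
	assert(model_egress_mtu_check(&plain, &dev, 2000, true, true) == 0);
	/* Redirected and too long: dropped. */
	assert(model_egress_mtu_check(&redir, &dev, 2000, false, false) == -EMSGSIZE);
	/* Grown by the egress hook past the limit: dropped. */
	assert(model_egress_mtu_check(&grown, &dev, 1500, true, true) == -EMSGSIZE);
	/* Grown by the egress hook but still within the limit: passes. */
	assert(model_egress_mtu_check(&small, &dev, 100, true, true) == 0);
}
```

The model covers only the single point in __dev_queue_xmit() where the patch
places the check. It does not capture the concerns raised in the reply: filters
attached to 'real' qdiscs would run after this point, and stacked devices would
pass through it more than once.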



