Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
On 10/25/23 10:59 AM, Ido Schimmel wrote:
On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
diff --git a/net/core/dev.c b/net/core/dev.c
index 606a366cc209..664426285fa3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
   #endif /* CONFIG_NET_EGRESS */

   #ifdef CONFIG_NET_XGRESS
-static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
+static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
+              enum skb_drop_reason *drop_reason)
   {
      int ret = TC_ACT_UNSPEC;
   #ifdef CONFIG_NET_CLS_ACT
@@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)

      tc_skb_cb(skb)->mru = 0;
      tc_skb_cb(skb)->post_ct = false;
+    res.drop_reason = *drop_reason;

      mini_qdisc_bstats_cpu_update(miniq, skb);
      ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
      /* Only tcf related quirks below. */
      switch (ret) {
      case TC_ACT_SHOT:
+            *drop_reason = res.drop_reason;

Daniel,

Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
reproducer [2]. Problem seems to be that classifiers clear 'struct
tcf_result::drop_reason', thereby triggering the warning in
__kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).

Fixed by maintaining the original drop reason if the one returned from
tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
unless you have a better idea.

Thanks for catching this, looks reasonable to me as a fix.

[1]
WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
Modules linked in:
CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
RIP: 0010:kfree_skb_reason+0x38/0x130
[...]
Call Trace:
   <IRQ>
   __netif_receive_skb_core.constprop.0+0x837/0xdb0
   __netif_receive_skb_one_core+0x3c/0x70
   process_backlog+0x95/0x130
   __napi_poll+0x25/0x1b0
   net_rx_action+0x29b/0x310
   __do_softirq+0xc0/0x29b
   do_softirq+0x43/0x60
   </IRQ>

[2]
#!/bin/bash

ip link add name veth0 type veth peer name veth1
ip link set dev veth0 up
ip link set dev veth1 up
tc qdisc add dev veth1 clsact
tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1

I didn't know you're using mausezahn, nice :)

[3]
diff --git a/net/core/dev.c b/net/core/dev.c
index a37a932a3e14..abd0b13f3f17 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
          /* Only tcf related quirks below. */
          switch (ret) {
          case TC_ACT_SHOT:
-               *drop_reason = res.drop_reason;
+               if (res.drop_reason != SKB_NOT_DROPPED_YET)
+                       *drop_reason = res.drop_reason;
                  mini_qdisc_qstats_cpu_drop(miniq);
                  break;
          case TC_ACT_OK:


Out of curiosity - how does the policy say "drop" but drop_reason does
not reflect it?

Ido, Jamal, wdyt about this alternative approach - these were the locations I could
find from an initial glance (compile-tested) :

From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Date: Wed, 25 Oct 2023 11:43:44 +0000
Subject: [PATCH] net, sched: fix..

Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
---
 include/net/pkt_cls.h    | 12 ++++++++++++
 net/sched/cls_basic.c    |  2 +-
 net/sched/cls_bpf.c      |  2 +-
 net/sched/cls_flower.c   |  2 +-
 net/sched/cls_fw.c       |  2 +-
 net/sched/cls_matchall.c |  2 +-
 net/sched/cls_route.c    |  4 ++--
 net/sched/cls_u32.c      |  2 +-
 8 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a76c9171db0e..31d8e8587824 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
 	res->drop_reason = reason;
 }

+static inline void tcf_set_result(struct tcf_result *to,
+				  const struct tcf_result *from)
+{
+	/* tcf_result's drop_reason which is the last member must be
+	 * preserved and cannot be copied from the cls'es tcf_result
+	 * template given this is carried all the way and potentially
+	 * set to a concrete tc drop reason upon error or intentional
+	 * drop. See tcf_set_drop_reason() locations.
+	 */
+	memcpy(to, from, offsetof(typeof(*to), drop_reason));
+}
+
 static inline void
 __tcf_bind_filter(struct Qdisc *q, struct tcf_result *r, unsigned long base)
 {
diff --git a/net/sched/cls_basic.c b/net/sched/cls_basic.c
index 1b92c33b5f81..d7ead3fc3c45 100644
--- a/net/sched/cls_basic.c
+++ b/net/sched/cls_basic.c
@@ -50,7 +50,7 @@ TC_INDIRECT_SCOPE int basic_classify(struct sk_buff *skb,
 		if (!tcf_em_tree_match(skb, &f->ematches, NULL))
 			continue;
 		__this_cpu_inc(f->pf->rhit);
-		*res = f->res;
+		tcf_set_result(res, &f->res);
 		r = tcf_exts_exec(skb, &f->exts, res);
 		if (r < 0)
 			continue;
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 382c7a71f81f..e4620a462bc3 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -124,7 +124,7 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb,
 			res->class   = 0;
 			res->classid = filter_res;
 		} else {
-			*res = prog->res;
+			tcf_set_result(res, &prog->res);
 		}

 		ret = tcf_exts_exec(skb, &prog->exts, res);
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index e5314a31f75a..eb94090fb26c 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -341,7 +341,7 @@ TC_INDIRECT_SCOPE int fl_classify(struct sk_buff *skb,

 		f = fl_mask_lookup(mask, &skb_key);
 		if (f && !tc_skip_sw(f->flags)) {
-			*res = f->res;
+			tcf_set_result(res, &f->res);
 			return tcf_exts_exec(skb, &f->exts, res);
 		}
 	}
diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c
index c49d6af0e048..70b873f8771f 100644
--- a/net/sched/cls_fw.c
+++ b/net/sched/cls_fw.c
@@ -63,7 +63,7 @@ TC_INDIRECT_SCOPE int fw_classify(struct sk_buff *skb,
 		for (f = rcu_dereference_bh(head->ht[fw_hash(id)]); f;
 		     f = rcu_dereference_bh(f->next)) {
 			if (f->id == id) {
-				*res = f->res;
+				tcf_set_result(res, &f->res);
 				if (!tcf_match_indev(skb, f->ifindex))
 					continue;
 				r = tcf_exts_exec(skb, &f->exts, res);
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index c4ed11df6254..a4018db80a60 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -37,7 +37,7 @@ TC_INDIRECT_SCOPE int mall_classify(struct sk_buff *skb,
 	if (tc_skip_sw(head->flags))
 		return -1;

-	*res = head->res;
+	tcf_set_result(res, &head->res);
 	__this_cpu_inc(head->pf->rhit);
 	return tcf_exts_exec(skb, &head->exts, res);
 }
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index 1424bfeaca73..cbfaa1d1820f 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -109,7 +109,7 @@ static inline int route4_hash_wild(void)

 #define ROUTE4_APPLY_RESULT()					\
 {								\
-	*res = f->res;						\
+	tcf_set_result(res, &f->res);				\
 	if (tcf_exts_has_actions(&f->exts)) {			\
 		int r = tcf_exts_exec(skb, &f->exts, res);	\
 		if (r < 0) {					\
@@ -152,7 +152,7 @@ TC_INDIRECT_SCOPE int route4_classify(struct sk_buff *skb,
 			goto failure;
 		}

-		*res = f->res;
+		tcf_set_result(res, &f->res);
 		spin_unlock(&fastmap_lock);
 		return 0;
 	}
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 6663e971a13e..f50ae40a29d5 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -172,7 +172,7 @@ TC_INDIRECT_SCOPE int u32_classify(struct sk_buff *skb,
 check_terminal:
 			if (n->sel.flags & TC_U32_TERMINAL) {

-				*res = n->res;
+				tcf_set_result(res, &n->res);
 				if (!tcf_match_indev(skb, n->ifindex)) {
 					n = rcu_dereference_bh(n->next);
 					goto next_knode;
--
2.34.1





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux