On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote: > > Hi Ido, > > On 10/25/23 10:59 AM, Ido Schimmel wrote: > > On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote: > >> diff --git a/net/core/dev.c b/net/core/dev.c > >> index 606a366cc209..664426285fa3 100644 > >> --- a/net/core/dev.c > >> +++ b/net/core/dev.c > >> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue); > >> #endif /* CONFIG_NET_EGRESS */ > >> > >> #ifdef CONFIG_NET_XGRESS > >> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb) > >> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb, > >> + enum skb_drop_reason *drop_reason) > >> { > >> int ret = TC_ACT_UNSPEC; > >> #ifdef CONFIG_NET_CLS_ACT > >> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb) > >> > >> tc_skb_cb(skb)->mru = 0; > >> tc_skb_cb(skb)->post_ct = false; > >> + res.drop_reason = *drop_reason; > >> > >> mini_qdisc_bstats_cpu_update(miniq, skb); > >> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false); > >> /* Only tcf related quirks below. */ > >> switch (ret) { > >> case TC_ACT_SHOT: > >> + *drop_reason = res.drop_reason; > > > > Daniel, > > > > Getting the following splat [1] with CONFIG_DEBUG_NET=y and this > > reproducer [2]. Problem seems to be that classifiers clear 'struct > > tcf_result::drop_reason', thereby triggering the warning in > > __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0). > > > > Fixed by maintaining the original drop reason if the one returned from > > tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix > > unless you have a better idea. > > Thanks for catching this, looks reasonable to me as a fix. > > > [1] > > WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130 > > Modules linked in: > > CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 > > RIP: 0010:kfree_skb_reason+0x38/0x130 > > [...] > > Call Trace: > > <IRQ> > > __netif_receive_skb_core.constprop.0+0x837/0xdb0 > > __netif_receive_skb_one_core+0x3c/0x70 > > process_backlog+0x95/0x130 > > __napi_poll+0x25/0x1b0 > > net_rx_action+0x29b/0x310 > > __do_softirq+0xc0/0x29b > > do_softirq+0x43/0x60 > > </IRQ> > > > > [2] > > #!/bin/bash > > > > ip link add name veth0 type veth peer name veth1 > > ip link set dev veth0 up > > ip link set dev veth1 up > > tc qdisc add dev veth1 clsact > > tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop > > mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1 > > I didn't know you're using mausezahn, nice :) > > > [3] > > diff --git a/net/core/dev.c b/net/core/dev.c > > index a37a932a3e14..abd0b13f3f17 100644 > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb, > > /* Only tcf related quirks below. */ > > switch (ret) { > > case TC_ACT_SHOT: > > - *drop_reason = res.drop_reason; > > + if (res.drop_reason != SKB_NOT_DROPPED_YET) > > + *drop_reason = res.drop_reason; > > mini_qdisc_qstats_cpu_drop(miniq); > > break; > > case TC_ACT_OK: > > Out of curiosity - how does the policy say "drop" but drop_reason does not reflect it? cheers, jamal > When you submit feel free to add: > > Acked-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>