On 7/27/23 7:41 AM, Leon Romanovsky wrote:
On Wed, Jul 26, 2023 at 04:33:40PM -0700, Martin KaFai Lau wrote:
On 7/26/23 11:16 AM, Martin KaFai Lau wrote:
On 7/26/23 10:01 AM, Leon Romanovsky wrote:
On Wed, Jul 26, 2023 at 08:23:12AM -0700, Jakub Kicinski wrote:
On Wed, 26 Jul 2023 10:12:54 +0300 Leon Romanovsky wrote:
Thanks, I'll take a look this evening.
Did anybody post a fix for that?
We are experiencing the following kernel panic in netdev commit
b57e0d48b300 (net-next/main) Merge branch '100GbE' of
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Not that I know, looks like this is with Daniel's previous fix already
present, and syzbot is hitting it, too :(
My naive workaround which restored our regression runs is:
diff --git a/kernel/bpf/tcx.c b/kernel/bpf/tcx.c
index 69a272712b29..10c9ab830702 100644
--- a/kernel/bpf/tcx.c
+++ b/kernel/bpf/tcx.c
@@ -111,6 +111,7 @@ void tcx_uninstall(struct net_device *dev, bool ingress)
 			bpf_prog_put(tuple.prog);
 		tcx_skeys_dec(ingress);
 	}
-	WARN_ON_ONCE(tcx_entry(entry)->miniq_active);
+	tcx_miniq_set_active(entry, false);
Thanks for the report. I will look into it.
I don't see how that may be triggered for now after Daniel's recent fix in
commit dc644b540a2d ("tcx: Fix splat in ingress_destroy upon
tcx_entry_free").
Both our regression and syzbot have this fix in the trees.
Do you have a small reproducible case? Thanks.
Unfortunately no.
Thanks for the report, we found the root cause and will send a fix in the next
day or two.
Best,
Daniel