On 7/21/23 4:57 PM, Petr Machata wrote:
As of this patch (commit e420bed02507), TC qdisc installation and/or removal cause memory access issues in the system. A semi-minimal reproducer is:

    bash-5.2# ip l a name v1 type veth peer name v2
    bash-5.2# ip l s dev v1 up
    bash-5.2# ip l s dev v2 up
    bash-5.2# tc q a dev v1 ingress
    bash-5.2# tc q d dev v1 ingress
    bash-5.2# tc q a dev v1 ingress
    bash-5.2# tc q d dev v1 ingress

It's a bit finicky, but only a little. For me, the first two "tc q" operations never triggered a splat. Then it could take a few "tc q a" / "tc q d" iterations to get it to splat. So it looks like maybe the first "tc q d" is the problematic bit? And then there's some likelihood of failing on any following "tc q" operation.

The above in particular produced three warning splats for me (attached as decoded.txt, decoded2.txt and decoded3.txt). Probing further:

    bash-5.2# tc q a dev v1 ingress

This produced two more splats from KASAN (decoded4.txt and decoded5.txt), which look more serious. Further attempts to prod the system deadlock it, I guess because RTNL was left locked.

Reverting e420bed02507, as well as fe20ce3a5126 and 55cc3768473e, which fail to build without it, makes net-next/main work again.

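Since the report says it can take a few add/delete iterations before a splat shows up, the reproducer could be wrapped in a loop. The following is only a sketch, not part of the original report: the names `run`, `repro`, `DRY_RUN` and `ITERATIONS` are hypothetical, the iteration count of 10 is arbitrary, and the commands are the long-form spellings of the abbreviated `ip l` / `tc q` invocations above. By default it only prints the commands; actually running them needs root and a kernel you are prepared to crash.

```shell
#!/bin/sh
# Sketch of the reproducer as a loop (hypothetical helper, not from the report).
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
ITERATIONS="${ITERATIONS:-10}"

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro() {
    # Create the veth pair and bring both ends up.
    run ip link add name v1 type veth peer name v2
    run ip link set dev v1 up
    run ip link set dev v2 up

    # Repeatedly install and remove the ingress qdisc on v1.
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        run tc qdisc add dev v1 ingress
        run tc qdisc del dev v1 ingress
        i=$((i + 1))
    done
}

repro
```

Saved as, say, repro.sh (file name assumed), `DRY_RUN=0 ITERATIONS=20 sh repro.sh` would execute the sequence for real; watching dmesg in another terminal should show any resulting splats.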
Sorry about that, fix should be here:

https://lore.kernel.org/netdev/20230721233330.5678-1-daniel@xxxxxxxxxxxxx/

Thanks,
Daniel