On 7/12/23 09:52, Nikolay Aleksandrov wrote:
On 12/07/2023 17:48, Ido Schimmel wrote:
On Tue, Jul 11, 2023 at 04:54:15PM -0700, Kuniyuki Iwashima wrote:
When we create an L2 loop on a bridge in netns, we will see packets storm
even if STP is enabled.
# unshare -n
# ip link add br0 type bridge
# ip link add veth0 type veth peer name veth1
# ip link set veth0 master br0 up
# ip link set veth1 master br0 up
# ip link set br0 type bridge stp_state 1
# ip link set br0 up
# sleep 30
# ip -s link show br0
2: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether b6:61:98:1c:1c:b5 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped missed mcast
956553768 12861249 0 0 0 12861249 <-. Keep
TX: bytes packets errors dropped carrier collsns | increasing
1027834 11951 0 0 0 0 <-' rapidly
This is because llc_rcv() drops all packets in non-root netns and BPDU
is dropped.
Let's show an error when enabling STP in netns.
# unshare -n
# ip link add br0 type bridge
# ip link set br0 type bridge stp_state 1
Error: bridge: STP can't be enabled in non-root netns.
Note this commit will be reverted later when we namespacify the whole LLC
infra.
Fixes: e730c15519d0 ("[NET]: Make packet reception network namespace safe")
Suggested-by: Harry Coin <hcoin@xxxxxxxxxxxxxxxxx>
I'm not sure that's accurate. I read his response in the link below and
he says "I'd rather be warned than blocked" and "But better warned and
awaiting a fix than blocked", which I agree with. The patch has the
potential to cause a lot of regressions, but without actually fixing the
problem.
How about simply removing the error [1]? Since iproute2 commit
844c37b42373 ("libnetlink: Handle extack messages for non-error case"),
it can print extack warnings and not only errors. With the diff below:
# unshare -n
# ip link add name br0 type bridge
# ip link set dev br0 type bridge stp_state 1
Warning: bridge: STP can't be enabled in non-root netns.
# echo $?
0
[1]
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index a807996ac56b..b5143de37938 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -201,10 +201,8 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
{
ASSERT_RTNL();
- if (!net_eq(dev_net(br->dev), &init_net)) {
+ if (!net_eq(dev_net(br->dev), &init_net))
NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled in non-root netns");
- return -EINVAL;
- }
if (br_mrp_enabled(br)) {
NL_SET_ERR_MSG_MOD(extack,
I'd prefer this approach to changing user-visible behaviour and potential regressions.
Just change the warning message.
Thanks,
Nik
Remember, the only way it's honest to 'warn but not block' STP in netns
is trust in the Kuniyuki's assertion that the llc will be
'namespacified' in a near term frame. As STP is not only
non-functional in a netns, but will in fact bring down every connected
system in a packet storm should a L2 loop exist the situation is much
worse than a merely inaccessible extra feature. This as the only reason
STP exists is to avoid crashing sites owing to packet storms arising
from L2 loops. I think as this bug is a potential 'site killer'
(which in fact happened to me!) The Linux dev community has an
obligation to either hard block this or commit to a fix time frame and
merely warn.
Thanks
Harry Coin