Re: [PATCH net-next v3 6/6] net: lan966x: Add switchdev support

Vladimir Oltean <vladimir.oltean@xxxxxxx> · Tue, 14 Dec 2021 00:01:52 +0000

On Mon, Dec 13, 2021 at 10:24:50PM +0100, Horatiu Vultur wrote:
> The 12/13/2021 16:25, Vladimir Oltean wrote:
> >
> > On Mon, Dec 13, 2021 at 04:28:24PM +0100, Horatiu Vultur wrote:
> > > The 12/13/2021 14:29, Vladimir Oltean wrote:
> > > >
> > > > On Mon, Dec 13, 2021 at 03:26:56PM +0100, Horatiu Vultur wrote:
> > > > > > They are independent of each other. You deduce the interface on which
> > > > > > the notifier was emitted using switchdev_notifier_info_to_dev() and act
> > > > > > upon it, if lan966x_netdevice_check() is true. The notifier handling
> > > > > > code itself is stateless, all the state is per port / per switch.
> > > > > > If you register one notifier handler per switch, lan966x_netdevice_check()
> > > > > > would return true for each notifier handler instance, and you would
> > > > > > handle each event twice, would you not?
> > > > >
> > > > > That is correct, I will get the event twice which is a problem in the
> > > > > lan966x. The function lan966x_netdevice_check should be per instance, in
> > > > > this way each instance can filter the events.
> > > > > The reason why I am putting the notifier_block inside lan966x is to be
> > > > > able to get to the instance of lan966x even if I get a event that is not
> > > > > for lan966x port.
> > > >
> > > > That isn't a problem, every netdevice notifier still sees all events.
> > >
> > > Yes, that is correct.
> > > Sorry maybe I am still confused, but some things are still not right.
> > >
> > > So lets say there are two lan966x instances(A and B) and each one has 2
> > > ports(ethA0, ethA1, ethB0, ethB1).
> > > So with the current behaviour, if for example ethA0 is added in vlan
> > > 100, then we get two callbacks for each lan966x instance(A and B) because
> > > each of them registered. And because of lan966x_netdevice_check() is true
> > > for ethA0 will do twice the work.
> > > And you propose to have a singleton notifier so we get only 1 callback
> > > and will be fine for this case. But if you add for example the bridge in
> > > vlan 200 then I will never be able to get to the lan966x instance which
> > > is needed in this case.
> >
> > I'm not sure what you mean by "you add the bridge in vlan 200" with
> > respect to netdevice notifiers. Create an 8021q upper with VID 200 on
> > top of a bridge (as that would generate a NETDEV_CHANGEUPPER)?
>
> I meant the following:
>
> ip link add name brA type bridge
> ip link add name brB type bridge
> ip link set dev ethA0 master brA
> ip link set dev ethA1 master brA
> ip link set dev ethB0 master brB
> ip link set dev ethB1 master brB
> bridge vlan add dev brA vid 200 self

Ok, so the same as this use case and patch posted by Florian for DSA:
https://lkml.org/lkml/2018/6/24/300
we should be getting back to it some day.

> After the last command both lan966x instances will get a switchdev blocking
> event where event is SWITCHDEV_PORT_OBJ_ADD and obj->id is
> SWITCHDEV_OBJ_ID_PORT_VLAN. And in this case the
> switchdev_notifier_info_to_dev will return brA.

It returns brA anyway. But the point being, your current code submission
is something like this (of course, I had to fish these two functions
from two different patches, because they still aren't properly split):

static int lan966x_vlan_cpu_add_vlan_mask(struct lan966x *lan966x, u16 vid)
{
	lan966x->vlan_mask[vid] |= BIT(CPU_PORT);
	return lan966x_vlan_set_mask(lan966x, vid);
}

static int lan966x_handle_port_vlan_add(struct net_device *dev,
					struct notifier_block *nb,
					const struct switchdev_obj_port_vlan *v)
{
	struct lan966x_port *port;
	struct lan966x *lan966x;

	/* When adding a port to a vlan, we get a callback for the port but
	 * also for the bridge. When get the callback for the bridge just bail
	 * out. Then when the bridge is added to the vlan, then we get a
	 * callback here but in this case the flags has set:
	 * BRIDGE_VLAN_INFO_BRENTRY. In this case it means that the CPU
	 * port is added to the vlan, so the broadcast frames and unicast frames
	 * with dmac of the bridge should be foward to CPU.
	 */
	if (netif_is_bridge_master(dev) &&
	    !(v->flags & BRIDGE_VLAN_INFO_BRENTRY))
		return 0;

	lan966x = container_of(nb, struct lan966x, switchdev_blocking_nb);

	/* In case the port gets called */
	if (!(netif_is_bridge_master(dev))) {
		if (!lan966x_netdevice_check(dev))
			return -EOPNOTSUPP;

		port = netdev_priv(dev);
		return lan966x_vlan_port_add_vlan(port, v->vid,
						  v->flags & BRIDGE_VLAN_INFO_PVID,
						  v->flags & BRIDGE_VLAN_INFO_UNTAGGED);
	}

	/* In case the bridge gets called */
	if (netif_is_bridge_master(dev))
		return lan966x_vlan_cpu_add_vlan(lan966x, dev, v->vid);
		~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
		In which way does this function call, exactly, check
		your lan966x's relationship to that bridge?

	return 0;
}

My point being, if you have two veth interfaces in your system just
minding their own business and being put in an unrelated bridge, and
that bridge would be put in VLAN 100 too, my understanding is that your
lan966x driver would sniff that event and add its CPU port to VLAN 100
too.  The reverse is true as well: any removal of a bridge from a VLAN
would also cause your CPU port to stop being in that VLAN, no matter
what interfaces may be in that VLAN. How could I say this... "spooky
action at a distance".

> > If there's a netdevice event on a bridge, the singleton netdevice event
> > handler can see if it is a bridge (netif_is_bridge_master), and if it
> > is, it can crawl through the bridge's lower interfaces using
> > netdev_for_each_lower_dev to see if there is any lan966x interface
> > beneath it. If there isn't, nothing to do. Otherwise, you get the
> > opportunity to do something for each port under that bridge.
>
> If I start to use switchdev_handle_port_obj_add, then as you mention
> will get a callback for each interface under the port and then I need to
> look in obj->orig_dev to see if it was a bridge or was a port that was
> part of the bridge.

Oh yes of course. And right now you don't need that because? You think
you get notifications only of switchdev events emitted by bridges that
you have a port in?

> If I don't use switchdev_handle_port_obj_add and implement own function
> then there is no way to get to the lan966x instance.

The switchdev_handle_port_obj_add() function isn't magic, and it has an
actual public implementation, too. Sure you can get to the lan966x
instance even if you don't use switchdev_handle_port_obj_add() -
although, it is there for people to use it.

> I will get two callbacks but then they can be filtered based on the
> bridge. If I use switchdev_handle_port_obj_add then if I have 2 ports
> under the bridge, both ports will be called which will do the same
> work anyway.

And that's a good thing, if you actually think about how you design
things to actually work. Please consider that you have two distinct
events: you can join a bridge that is in a VLAN, or the bridge can join
that VLAN while you have some ports under it. The invariant is that your
CPU port needs to be in that VLAN only for as long as there is any port
under that bridge. So it is actually beneficial to use the
switchdev_handle_* helper. It tells you how many users of the CPU VLAN
rule there still are. It would be broken to delete it right away, when a
port leaves the bridge. It would also be broken to not delete it after
all ports leave: the bridge may have a longer lifetime than the lan966x
ports beneath them, so there may not be any deletion event that you
should expect.

> So I am not sure how much I will benefit of using
> switchdev_handle_port_obj_add in this case.
>
> One important observation in the driver is that I am separating these 2
> cases:
>
> bridge vlan add dev brA vid 300 self
> bridge vlan add dev ethA0 vid 400

Understood, and that's ok. But I'm not convinced it works, though.

> > Maybe I'm not understanding what you're trying to accomplish that's different?
>
> The reason is that I want to use brA to control the CPU port. To decide
> which frames to be copy to the CPU. Also to copy as few as possible
> frames to CPU.
>
> If we still want to go with the approach of using a singleton notifier
> block, then we will still have a problem for netdevice notifier block.
> We will get the same issue, can't get to lan966x instance in case the
> lan966x callback is called for a different device. And we need this for
> the following case:
>
> If for example eth0, eth1 are part of a different IP and eth2, eth3 are
> part of lan966x. We would like not to be able to put under the same
> bridge interfaces that are part of different IPs (more precisely,
> lan966x interfaces can be only under a bridge where lan966x interfaces
> are part).
>
> For example the following command should fail:
> ip link add name br0 type bridge
> ip link set dev eth0 master br0
> ip link set dev eth2 master br0
>
> Also the this command should fail:
> ip link add name br0 type bridge
> ip link set dev eth2 master br0
> ip link set dev eth0 master br0
>
> But the following should be accepted:
> ip link add name br0 type bridge
> ip link set dev eth0 master br0
> ip link set dev eth1 master br0
> ip link add name br1 type bridge
> ip link set dev eth2 master br1
> ip link set dev eth3 master br1

You can track NETDEV_PRECHANGEUPPER and deny that, and also provide an
extack with a reason. That should work, it's been tested.

> Maybe I should also make it explictly that is not allow to have more
> than one instance of lan966x for now. And once is needed then add
> support for it.

I don't necessarily see the reason for this, but ok. I don't think you
should view things as "support for parallel instances of the driver is
what's complicating the implementation", but rather "catching the events
in all the permutations that they can happen in is what this driver
needs to provide a good user experience".