Re: net: switchdev: Port-to-Bridge Synchronization

Vladimir Oltean <olteanv@xxxxxxxxx> · Wed, 31 Jan 2024 15:10:48 +0200

On Tue, Jan 30, 2024 at 10:23:34PM +0100, Tobias Waldekranz wrote:
> On mån, jan 29, 2024 at 14:17, Vladimir Oltean <olteanv@xxxxxxxxx> wrote:
> > I'm thinking we could do something like this in br_switchdev_mdb_replay().
> > We could block concurrent callers to br_switchdev_mdb_notify() by
> > acquiring br->multicast_lock, so that br->mdb_list stays constant for a
> > while.
> >
> > Then, after constructing the local mdb_list, the problem is that it
> > still contains some SWITCHDEV_F_DEFER elements which are pending a call
> > switchdev_deferred_process(). But that can't run currently, so we can
> > iterate through switchdev's "deferred" list and remove them from the
> > replay list, if we figure out some sort of reliable switchdev object
> > comparison function. Then we can safely release br->multicast_lock.
> 
> That would _almost_ work, I think. But instead of purging the deferred
> items, I think we have to skip queuing the replay events in these
> cases. Otherwise we limit the scope of the notification to the
> requesting driver, when it ought to reach all registered callbacks on
> the notifier chain.
> 
> This matters if a driver wants to handle foreign group memberships the
> same way we do with FDB entries, which I want to add to DSA once this
> race has been taken care of.

Yes, not purging the deferred items (for the reasons you mention), but
as I said, "remove them from the replay list" (the local mdb_list).
Similar to what you've done in the series you just posted
(https://lore.kernel.org/netdev/20240131123544.462597-3-tobias@xxxxxxxxxxxxxx/),
only that instead of removing from the list, what you do is you never
call list_add_tail().

> > The big problem I see is that FDB notifications don't work that way.
> > They are also deferred, but all the deferred processing is on the
> > switchdev driver side, not on the bridge side. One of the reasons is
> > that the deferred side should not need rtnl_lock() at all. I don't see
> > how this model would scale to FDB processing.
> 
> Yeah, that's where I threw in the towel as well. That issue will just
> have to go unsolved for now.

I mean I have some ideas for FDB replays as well, but they all revolve
around making the interface more similar to how MDBs are handled, while
also maintaining the rtnl-lockless behavior. It's a lot more rework,
potentially involves maintaining 2 parallel mechanisms for switchdev FDB
notifications, and very likely something to handle in net-next.