Re: [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware

John Fastabend <john.r.fastabend@xxxxxxxxx> · Wed, 29 Feb 2012 09:25:56 -0800

On 2/29/2012 5:56 AM, Jamal Hadi Salim wrote:
> On Tue, 2012-02-28 at 20:40 -0800, John Fastabend wrote:
> 
>> OK back to this. The last piece is where to put these messages...
>> we could take PF_ROUTE:RTM_*NEIGH
>>
>>      PF_ROUTE:RTM_NEWNEIGH - Add a new FDB entry to an offloaded
>>                              switch.
>>      PF_ROUTE:RTM_DELNEIGH - Delete an FDB entry from an offloaded
>>                              switch.
>>      PF_ROUTE:RTM_GETNEIGH - Dumps the embedded FDB table
>>
> 
> Why RTM_*NEIGH? RTM tends to map to Route/L3 and NEIGH tends to map
> to ndisc or ARP both tied to IP address resolution. While both ARP/Ndisc
> may play a role in the user space app populating the FDB, I don't think
> they are necessary players.
> Learning could be via a table entry miss and packet redirect to user
> space.
> So my suggestion is to use FDB_*ENTRY for names
>  

Well, I think NETLINK_ROUTE is the most correct type to use in this
case. Per netlink.h it's for routing and device hooks.

#define NETLINK_ROUTE           0       /* Routing/device hook                          */

And NETLINK_ROUTE msg_types use the RTM_* prefix. The _*NEIGH suffix
was merely copied from the SW bridge code paths. How about:

PF_BRIDGE:RTM_FDB_NEWENTRY
PF_BRIDGE:RTM_FDB_DELENTRY
PF_BRIDGE:RTM_FDB_GETENTRY

And a new group RTNLGRP_FDB. Also using NETLINK_ROUTE gives the correct
rtnl locking semantics for free.
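
To make the proposal a bit more concrete, here is a rough user space
sketch of how an "add FDB entry" request could be laid out on the wire.
It is only an illustration: RTM_FDB_NEWENTRY does not exist in any
header, so the sketch aliases it to the existing RTM_NEWNEIGH and reuses
struct ndmsg and NDA_LLADDR as the software bridge path does. The names
SKETCH_RTM_FDB_NEWENTRY, struct fdb_req and fdb_build_add() are made up
for this example, and nothing is actually sent to the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>        /* AF_BRIDGE */
#include <linux/netlink.h>     /* struct nlmsghdr, NLMSG_* */
#include <linux/rtnetlink.h>   /* RTM_NEWNEIGH, struct rtattr, RTA_* */
#include <linux/neighbour.h>   /* struct ndmsg, NDA_LLADDR, NUD_* */

/* Hypothetical: stand-in for the proposed message type, which is not
 * defined anywhere. Aliased to RTM_NEWNEIGH purely for illustration. */
#define SKETCH_RTM_FDB_NEWENTRY RTM_NEWNEIGH

/* Hypothetical request layout: header, neighbour payload, attributes. */
struct fdb_req {
	struct nlmsghdr nlh;
	struct ndmsg    ndm;
	char            attrs[64];
};

/* Fill in a request that adds a static FDB entry for 'mac' on the
 * device with index 'ifindex'. Returns the total message length. */
static int fdb_build_add(struct fdb_req *req, int ifindex,
			 const uint8_t mac[6])
{
	struct rtattr *rta;

	/* The single attribute must fit in the attribute area. */
	assert(sizeof(req->attrs) >= RTA_SPACE(6));

	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len   = NLMSG_LENGTH(sizeof(struct ndmsg));
	req->nlh.nlmsg_type  = SKETCH_RTM_FDB_NEWENTRY;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;

	req->ndm.ndm_family  = AF_BRIDGE;      /* PF_BRIDGE family */
	req->ndm.ndm_ifindex = ifindex;
	req->ndm.ndm_state   = NUD_PERMANENT;  /* static entry */

	/* Append the MAC address as an NDA_LLADDR attribute. */
	rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->nlh.nlmsg_len));
	rta->rta_type = NDA_LLADDR;
	rta->rta_len  = RTA_LENGTH(6);
	memcpy(RTA_DATA(rta), mac, 6);

	req->nlh.nlmsg_len = NLMSG_ALIGN(req->nlh.nlmsg_len) +
			     RTA_ALIGN(rta->rta_len);
	return (int)req->nlh.nlmsg_len;
}
```

A caller would then write the first nlmsg_len bytes of the request to a
NETLINK_ROUTE socket; the delete case would look the same with the
corresponding delete message type.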

>> The neighbor code is using the PF_UNSPEC protocol type so we won't
>> collide with these unless someone was using PF_ROUTE and relying on
>> falling back to PF_UNSPEC. However, I couldn't find any programs that
>> did this; iproute2 certainly doesn't. And the bridge pieces are using
>> PF_BRIDGE so no collision there.
> 
> They have to be different calls from the calls that talk to the s/ware
> bridge. In my opinion, as controversial as this may sound, you need to
> be flexible enough that some vendor can replace these calls with
> proprietary calls which are more efficient for their hardware. So a
> "plugin" to replace these calls in the user space code would be a 
> good idea. Alternatively, you could make that something they do at
> the driver level, i.e. from user space to kernel it is a "hardware, please
> addthistotheFDBtable()" call and the implementation of that could be
> proprietary to the specific hardware.
> 

Agreed. I think adding some ndo ops for bridging offloads here would
work. For example, the DSA infrastructure and/or macvlan devices might
need this. Something along the lines of extending this RFC:

[RFC] hardware bridging support for DSA switches
http://patchwork.ozlabs.org/patch/16578/
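
As a rough illustration of the driver-level hook idea, here is a small
user space model, not kernel code. The generic layer only calls through
a table of function pointers, so each driver can supply its own
(possibly proprietary) implementation. Every name in it (sketch_dev,
sketch_fdb_ops, the toy_* driver) is hypothetical and is not taken from
net_device_ops or from the RFC above.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct sketch_dev;

/* Hypothetical per-driver hooks for programming the embedded FDB. */
struct sketch_fdb_ops {
	int (*fdb_add)(struct sketch_dev *dev, const uint8_t mac[6]);
	int (*fdb_del)(struct sketch_dev *dev, const uint8_t mac[6]);
};

/* Hypothetical device with a tiny table standing in for hardware. */
struct sketch_dev {
	const struct sketch_fdb_ops *ops;
	uint8_t table[4][6];
	int used;
};

/* Generic entry points: dispatch to the driver or report no support. */
static int sketch_fdb_add(struct sketch_dev *dev, const uint8_t mac[6])
{
	if (!dev->ops || !dev->ops->fdb_add)
		return -EOPNOTSUPP;
	return dev->ops->fdb_add(dev, mac);
}

static int sketch_fdb_del(struct sketch_dev *dev, const uint8_t mac[6])
{
	if (!dev->ops || !dev->ops->fdb_del)
		return -EOPNOTSUPP;
	return dev->ops->fdb_del(dev, mac);
}

/* Toy driver: a real one would write to the switch hardware here. */
static int toy_add(struct sketch_dev *dev, const uint8_t mac[6])
{
	if (dev->used == 4)
		return -ENOSPC;
	memcpy(dev->table[dev->used++], mac, 6);
	return 0;
}

static int toy_del(struct sketch_dev *dev, const uint8_t mac[6])
{
	for (int i = 0; i < dev->used; i++) {
		if (memcmp(dev->table[i], mac, 6) == 0) {
			/* Move the last entry into the freed slot. */
			dev->used--;
			memmove(dev->table[i], dev->table[dev->used], 6);
			return 0;
		}
	}
	return -ENOENT;
}

static const struct sketch_fdb_ops toy_ops = {
	.fdb_add = toy_add,
	.fdb_del = toy_del,
};
```

A device without hooks simply gets -EOPNOTSUPP back, so the caller can
decide whether to fall back to the software bridge.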

.John
