On 2/29/2012 5:56 AM, Jamal Hadi Salim wrote: > On Tue, 2012-02-28 at 20:40 -0800, John Fastabend wrote: > >> OK back to this. The last piece is where to put these messages... >> we could take PF_ROUTE:RTM_*NEIGH >> >> PF_ROUTE:RTM_NEWNEIGH - Add a new FDB entry to an offloaded >> switch. >> PF_ROUTE:RTM_DELNEIGH - Delete a FDB entry from an offlaoded >> switch. >> PF_ROUTE:RTM_GETNEIGH - Dumps the embedded FDB table >> > > Why RTM_*NEIGH? RTM tends to map to Route/L3 and NEIGH tends to map > to ndisc or ARP both tied to IP address resolution. While both ARP/Ndisc > may play a role in the user space app populating the FDB, i dont think > they are necessary players. > Learning could be via a table entry miss and packet redirect to user > space. > So my suggestion is to use FDB_*ENTRY for names > Well I think NETLINK_ROUTE is the most correct type to use in this case. Per netlink.h its for routing and device hooks. #define NETLINK_ROUTE 0 /* Routing/device hook */ And NETLINK_ROUTE msg_types use the RTM_* prefix. The _*NEIGH postfix were merely a copy from the SW BRIDGE code paths. How about, PF_BRIDGE:RTM_FDB_NEWENTRY PF_BRIDGE:RTM_FDB_DELENTRY PF_BRIDGE:RTM_FDB_GETENTRY And a new group RTNLGRP_FDB. Also using NETLINK_ROUTE gives the correct rtnl locking semantics for free. >> The neighbor code is using the PF_UNSPEC protocol type so we won't >> collide with these unless someone was using PF_ROUTE and relying on >> falling back to PF_UNSPEC however I couldn't find any programs that >> did this iproute2 certainly doesn't. And the bridge pieces are using >> PF_BRIDGE so no collision there. > > They have to be different calls from the calls that talk to the s/ware > bridge. In my opinion, as controversial as this may sound, you need to > be flexible enough that some vendor can replace these calls with > proprietary calls which are more efficient for their hardware. So a > "plugin" to replace these calls in the user space code would be a > good idea. Alternatively, you could make that something they do at > the driver level i.e from user space to kernel it is "hardware, please > addthistotheFDBtable()" call and the implementation of that could be > proprietary to the specific hardware. > Agreed. I think adding some ndo_ops for bridging offloads here would work. For example the DSA infrastructure and/or macvlan devices might need this. Along the lines of extending this RFC, [RFC] hardware bridging support for DSA switches http://patchwork.ozlabs.org/patch/16578/ .John -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html