Re: [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/29/2012 9:52 AM, Stephen Hemminger wrote:
> On Wed, 29 Feb 2012 09:25:56 -0800
> John Fastabend <john.r.fastabend@xxxxxxxxx> wrote:
> 
>> On 2/29/2012 5:56 AM, Jamal Hadi Salim wrote:
>>> On Tue, 2012-02-28 at 20:40 -0800, John Fastabend wrote:
>>>
>>>> OK back to this. The last piece is where to put these messages...
>>>> we could take PF_ROUTE:RTM_*NEIGH
>>>>
>>>>      PF_ROUTE:RTM_NEWNEIGH - Add a new FDB entry to an offloaded
>>>>                              switch.
>>>>      PF_ROUTE:RTM_DELNEIGH - Delete a FDB entry from an offlaoded
>>>>                              switch.
>>>>      PF_ROUTE:RTM_GETNEIGH - Dumps the embedded FDB table
>>>>
>>>
>>> Why RTM_*NEIGH? RTM tends to map to Route/L3 and NEIGH tends to map
>>> to ndisc or ARP both tied to IP address resolution. While both ARP/Ndisc
>>> may play a role in the user space app populating the FDB, i dont think
>>> they are necessary players.
>>> Learning could be via a table entry miss and packet redirect to user
>>> space.
>>> So my suggestion is to use FDB_*ENTRY for names
>>>  
>>
>> Well I think NETLINK_ROUTE is the most correct type to use in this
>> case. Per netlink.h its for routing and device hooks.
>>
>> #define NETLINK_ROUTE           0       /* Routing/device hook                          */
>>
>> And NETLINK_ROUTE msg_types use the RTM_* prefix. The _*NEIGH postfix
>> were merely a copy from the SW BRIDGE code paths. How about,
>>
>> PF_BRIDGE:RTM_FDB_NEWENTRY
>> PF_BRIDGE:RTM_FDB_DELENTRY
>> PF_BRIDGE:RTM_FDB_GETENTRY
>>
>> And a new group RTNLGRP_FDB. Also using NETLINK_ROUTE gives the correct
>> rtnl locking semantics for free.
>>
>>>> The neighbor code is using the PF_UNSPEC protocol type so we won't
>>>> collide with these unless someone was using PF_ROUTE and relying on
>>>> falling back to PF_UNSPEC however I couldn't find any programs that
>>>> did this iproute2 certainly doesn't. And the bridge pieces are using
>>>> PF_BRIDGE so no collision there.
>>>
>>> They have to be different calls from the calls that talk to the s/ware
>>> bridge. In my opinion, as controversial as this may sound, you need to
>>> be flexible enough that some vendor can replace these calls with
>>> proprietary calls which are more efficient for their hardware. So a
>>> "plugin" to replace these calls in the user space code would be a 
>>> good idea. Alternatively, you could make that something they do at
>>> the driver level i.e from user space to kernel it is "hardware, please
>>> addthistotheFDBtable()" call and the implementation of that could be
>>> proprietary to the specific hardware.
>>>
>>
>> Agreed. I think adding some ndo_ops for bridging offloads here would
>> work. For example the DSA infrastructure and/or macvlan devices might
>> need this. Along the lines of extending this RFC,
>>
>> [RFC] hardware bridging support for DSA switches
>> http://patchwork.ozlabs.org/patch/16578/
> 
> I want to see a unified API so that user space control applications (RSTP, TRILL?)
> can use one set of netlink calls for both software bridge and hardware offloaded
> bridges.  Does this proposal meet that requirement?
> 

With the patches I sent out last night the same netlink calls are used
for both SW and HW with a flag set in ndm_flags to indicate it is a hardware
entry. The flag is needed when a port has offload support and is also
a slave of a SW bridge. Another option would be to apply the command to both
hardware and software tables. This might be good enough and user space would
not have to make distinctions between HW and SW bridges. Also helps with my
original use case where I want the SW and HW bridge FDBs to be in sync.

In response to Jamal's comment I proposed changing the type to RTM_FDB_XXXENTRY
but the message contents are the same in both cases.

Jamal, so why do "They have to be different calls"? I'm not so sure anymore...
moving to RTM_FDB_XXXENTRY saved some refactoring in the bridge module but that
is just cosmetic.

Thanks,
John
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux