On 2/24/2020 4:52 AM, Leon Romanovsky wrote:
> On Mon, Feb 24, 2020 at 07:10:13AM +0000, Parav Pandit wrote:
>> Hi Leon,
>>
>> On 2/23/2020 3:49 AM, Leon Romanovsky wrote:
>>> On Sun, Feb 23, 2020 at 12:44:12AM +0000, Parav Pandit wrote:
>>>> Hi Jason, Weihang,
>>>>
>>>> On 2/22/2020 5:40 PM, Jason Gunthorpe wrote:
>>>>> On Sat, Feb 22, 2020 at 11:48:04AM +0800, Weihang Li wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> We plan to implement LAG in hns drivers recently, and as we know, there is
>>>>>> already a mature and stable solution in the mlx5 driver. Considering that
>>>>>> the net subsystem in kernel adopt the strategy that the framework implement
>>>>>> bonding, we think it's reasonable to add LAG feature to the ib core based
>>>>>> on mlx5's implementation. So that all drivers including hns and mlx5 can
>>>>>> benefit from it.
>>>>>>
>>>>>> In previous discussions with Leon about achieving reporting of ib port link
>>>>>> event in ib core, Leon mentioned that there might be someone trying to do
>>>>>> this.
>>>>>>
>>>>>> So may I ask if there is anyone working on LAG in ib core or planning to
>>>>>> implement it in near future? I will appreciate it if you can share your
>>>>>> progress with me and maybe we can finish it together.
>>>>>>
>>>>>> If nobody is working on this, our team may take a try to implement LAG in
>>>>>> the core. Any comments and suggestion are welcome.
>>>>>
>>>>> This is something that needs to be done, I understand several of the
>>>>> other drivers are going to want to use LAG and we certainly can't have
>>>>> everything copied into each driver.
>>>>>
>>>>> Jason
>>>>>
>>>> I am not sure mlx5 is right model for new rdma bond device support which
>>>> I tried to highlight in Q&A-1 below.
>>>>
>>>> I have below not-so-refined proposal for rdma bond device.
>>>>
>>>> - Create a rdma bond device named rbond0 using two slave rdma devices
>>>> mlx5_0 mlx5_1 which is connected to netdevice bond1 and underlying dma
>>>> device of mlx5_0 rdma device.
>>>>
>>>> $ rdma dev add type bond name rbond0 netdev bond1 slave mlx5_0 slave
>>>> mlx5_1 dmadevice mlx5_0
>>>>
>>>> $ rdma dev show
>>>> 0: mlx5_0: node_type ca fw 12.25.1020 node_guid 248a:0703:0055:4660
>>>> sys_image_guid 248a:0703:0055:4660
>>>> 1: mlx5_1: node_type ca fw 12.25.1020 node_guid 248a:0703:0055:4661
>>>> sys_image_guid 248a:0703:0055:4660
>>>> 2: rbond0: node_type ca node_guid 248a:0703:0055:4660 sys_image_guid
>>>> 248a:0703:0055:4660
>>>>
>>>> - This should be done via rdma bond driver in
>>>> drivers/infiniband/ulp/rdma_bond
>>>
>>> Extra question, why do we need RDMA bond ULP which combines ib devices
>>> and not create netdev bond device and create one ib device on top of that?
>>>
>>
>> I read your question few times, but I likely don't understand your question.
>>
>> Are you asking why bonding should be implemented as dedicated
>> ulp/driver, and not as an extension by the vendor driver?
>
> No, I meant something different. You are proposing to combine IB
> devices, while keeping netdev devices separated. I'm asking if it is
> possible to combine netdev devices with already existing bond driver
> and simply create new ib device with bond netdev as an underlying
> provider.
>
Ah, I understand now.
Yes, if bond_newlink() can be extended to accept rdma device specific
parameters, it should be possible.
For example, we already have two classes of users: one wants an rdma bond
device created for a given bond netdev, and one doesn't.

You might already be aware that newlink() is called under the rtnl lock and
ib_register_device() also needs to acquire the rtnl lock.
But this is an implementation detail that can be discussed later, once a
bond driver extension for rdma makes sense. :-)

If one wants to create a vlan netdevice on top of a bond device, the user
usually creates it after the bond netdevice is created.
Similarly, I see the bond rdma device as composed of the underlying bond
netdev and the underlying rdma device, or at least its parent 'struct device',
for the purpose of irq routing, pci access etc.
Compared to overloading bond_newlink() with rdma parameters, creating the
rdma bond device independently and linking it to the bond netdev appears to
be a more elegant and modular approach.

> I'm not suggesting to implement anything in vendor drivers.
>
>>
>> In an alternative approach a given hw driver implements a newlink()
>> instead of dedicated bond driver.
>> However this will duplicate the code in few drivers. Hence, to do in
>> common driver and implement only necessary hooks in hw driver.
>
> I'm not sure about it.
>
>>
>>> Thanks
>>>
>>
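
To make the rtnl concern mentioned above concrete, here is a minimal
userspace sketch, not kernel code. It assumes only what the mail states:
newlink() runs with the rtnl lock held, and device registration needs the
same lock. All names (fake_rtnl, fake_newlink, fake_register_device) are
hypothetical stand-ins, and an error-checking pthread mutex is used so the
nested acquisition is reported as EDEADLK instead of hanging.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the rtnl lock; not the kernel API. */
static pthread_mutex_t fake_rtnl;

/* Use an error-checking mutex so that a nested lock attempt returns
 * EDEADLK rather than blocking forever like a plain mutex would. */
static int fake_rtnl_init(void)
{
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);

	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&fake_rtnl, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Models a registration path that takes the lock internally. */
static int fake_register_device(void)
{
	int ret = pthread_mutex_lock(&fake_rtnl);

	if (ret)
		return ret;	/* EDEADLK when the caller already holds it */
	pthread_mutex_unlock(&fake_rtnl);
	return 0;
}

/* Models a newlink()-style callback that is entered with the lock held. */
static int fake_newlink(void)
{
	int ret;

	pthread_mutex_lock(&fake_rtnl);		/* caller context: lock held */
	ret = fake_register_device();		/* nested acquisition attempt */
	pthread_mutex_unlock(&fake_rtnl);
	return ret;
}
```

After fake_rtnl_init(), calling fake_register_device() on its own succeeds,
while calling it via fake_newlink() fails with EDEADLK. Build with
-pthread. Creating the rdma bond device in a separate step, outside the
netdev newlink path, sidesteps this nesting by construction.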