On 2/24/2020 1:23 PM, Jason Gunthorpe wrote:
> On Mon, Feb 24, 2020 at 07:01:43PM +0000, Parav Pandit wrote:
>> On 2/24/2020 12:29 PM, Jason Gunthorpe wrote:
>>> On Mon, Feb 24, 2020 at 12:52:06PM +0200, Leon Romanovsky wrote:
>>>>> Are you asking why bonding should be implemented as a dedicated
>>>>> ULP/driver, and not as an extension of the vendor driver?
>>>>
>>>> No, I meant something different. You are proposing to combine IB
>>>> devices, while keeping netdev devices separated. I'm asking if it is
>>>> possible to combine netdev devices with the already existing bond
>>>> driver and simply create a new IB device with the bond netdev as the
>>>> underlying provider.
>>>
>>> Isn't that basically what we do now in mlx5?
>>>
>> And it is broken in a few respects that I described in Q&A question-1
>> earlier in this thread.
>>
>> On top of that, the user has no ability to disable RDMA bonding.
>
> And what does that mean? The real netdevs have no IP addresses, so what
> exactly does a non-bonded RoCEv2 RDMA device do?
>
Not sure. There is some default GID on it.

>> Users explicitly asked us for the ability to disable it in some cases
>> (not on the mailing list). So non-upstream hacks exist, which are not
>> applicable to this discussion.
>
> Bah, I'm aware of that - that request is a hack solution to something
> else as well.
>
>>> Logically the ib_device is attached to the bond, it uses the bond for
>>> IP addressing, etc.
>>>
>>> I'm not sure trying to have 3 ib_devices like netdev does is sane,
>>> that is very, very complicated to get everything to work. The ib stuff
>>> just isn't designed to be stacked like that.
>>>
>> I am not sure I understand all the complications you have thought
>> through. I thought of a few, put them forward in the Q&A in this
>> thread, and we can improve the design as we go forward.
>>
>> The idea is to stack the RDMA bond device on top of the existing RDMA
>> devices as an ib_client, so that the RDMA bond device is exactly aware
>> of what is going on with the slaves and the bond netdev.
>
> How do you safely proxy every single op from the bond to slaves?

The bond config should tell which slave to use, instead of the current
implicit choice.

> How do you force the slaves to allow PDs to be shared across them?

For a slave it doesn't matter whether the caller is the bond or a
direct user.

> What provider lives in user space for the bond driver? What happens to
> the udata/etc?

The same as that of the primary slave used for PCI and IRQ access,
whose info is provided through the new netlink discovery path.

> And it doesn't solve the main problem you raised, creating an IB device
> while holding RTNL simply should not ever be done. Moving this code
> into the core layer fixed it up significantly for the similar rxe/siw
> cases, I expect the same is possible for the LAG situation too.

The $ rdma add dev command won't hold the rtnl lock when the new rdma
bond device is added through the rdma netlink socket.