On Fri, 2019-02-22 at 13:23 -0700, Jason Gunthorpe wrote: > On Fri, Feb 22, 2019 at 08:13:58PM +0000, Ertman, David M wrote: > > > From: Jason Gunthorpe [mailto:jgg@xxxxxxxx] > > > Sent: Thursday, February 21, 2019 11:35 AM > > > To: Saleem, Shiraz <shiraz.saleem@xxxxxxxxx> > > > Cc: dledford@xxxxxxxxxx; davem@xxxxxxxxxxxxx; linux- > > > rdma@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; Ismail, Mustafa > > > <mustafa.ismail@xxxxxxxxx>; Kirsher, Jeffrey T < > > > jeffrey.t.kirsher@xxxxxxxxx>; > > > Patil, Kiran <kiran.patil@xxxxxxxxx>; Ertman, David M > > > <david.m.ertman@xxxxxxxxx> > > > Subject: Re: [RFC v1 01/19] net/i40e: Add peer register/unregister to > > > struct > > > i40e_netdev_priv > > > > > > On Thu, Feb 21, 2019 at 02:19:33AM +0000, Saleem, Shiraz wrote: > > > > > Subject: Re: [RFC v1 01/19] net/i40e: Add peer > > > > > register/unregister to > > > > > struct i40e_netdev_priv > > > > > > > > > > On Fri, Feb 15, 2019 at 11:10:48AM -0600, Shiraz Saleem wrote: > > > > > > Expose the register/unregister function pointers in the struct > > > > > > i40e_netdev_priv which is accesible via the netdev_priv() > > > > > > interface > > > > > > in the RDMA driver. On a netdev notification in the RDMA > > > > > > driver, > > > > > > the appropriate LAN driver register/unregister functions are > > > > > > invoked from the struct i40e_netdev_priv structure, > > > > > > > > > > Why? In later patches we get an entire device_add() based thing. > > > > > Why > > > > > do you need two things? > > > > > > > > > > The RDMA driver should bind to the thing that device_add created > > > > > and > > > > > from there reliably get the netdev. It should not listen to > > > > > netdev notifiers for > > > attachment. > > > > In the new IDC mechanism between ice<->irdma, the LAN driver setups > > > > up > > > > the device for us and attaches it to a software bus via > > > > device_add() based > > > mechanism. > > > > However, RDMA driver binds to the device only when the LAN > > > > 'register' > > > > function is called in irdma. > > > > > > That doesn't make sense. The PCI driver should always create the > > > required > > > struct device attachment point when attachment is becomes possible. > > > > > > > There is no ordering guarantee in which irdma, i40e and ice modules > > > > load. > > > > The netdev notifier is for the case where the irdma loads before > > > > i40e > > > > or ice. > > > > > > You are supposed to use the driver core to handle this ordering. > > > > > > The pci driver creates the attachment points in the correct order, > > > when they > > > are ready for use, and the driver core will automatically attach > > > registered > > > device drivers to the attachement points, no matter the module load > > > loader. > > > > > > You will have a netdev and a rdma attachment point, sounds like the > > > RDMA one > > > is created once the netdev is happy. > > > > > > Maybe what you are missing is a struct device_driver? > > > > > > Jason > > > > I am assuming that the term PCI driver is being used to mean the PCI > > subsystem in the kernel. If this assumption is wrong, please disregard > > the next > > paragraph, but the following points will still apply. > > No, I mean the driver that has the struct pci_driver for the PCI > function. Maybe that is the LAN driver for this case. Sorry for the delayed response, I was dealing with a death in the family. I want to make sure we are on the same page, so please correct me if I am wrong. The Intel RDMA driver is not a stand-alone PCI function driver because there is no separate PCI function for RDMA, so the RDMA driver does not call pci_register_driver(), that is done with the LAN driver. The Intel RDMA driver still needs to access to HW resources which is accessible only through the LAN PF driver (e.g. i40e/ice) and does not have its own netdev, so it uses the netdev exposed by the LAN PF driver. The access to the netdev is needed so that it can listen to RT netlink messages and other reasons. We refer to "software bus" because out hardware doe not expose the hardware bus, so our APIs to bus_register/unregister are actually using a software bus, which is exposed and managed by the LAN driver. > > bus, and has no ability to perform the described functions. The > > irdma driver cannot register with the software bus unless it > > registers with the LAN driver that controls the bus. The LAN > > driver's register function will call "driver_register(&drv->driver)" > > for the registering irdma driver. > > That isn't how to use the driver core. Software bus exposing and managing in other kernel drivers, like Many Integrated Core (MIC) software driver and virtio subsystem. > > Since the irdma driver is a consolidated driver (supports both ice and > > i40e LAN > > drivers), we cannot guarantee that a given LAN driver will load before > > the irdma > > driver. Even if we use module dependencies to make irdma depend on > > (ice || > > i40e), we have to consider the situation where a machine will have both > > an ice > > supported LAN device and an i40e supported LAN device in it. In this > > case, the > > load order could be (e.g.) i40e -> irdma -> ice. The irdma driver can > > check all > > present netdevs when it loads to find the one that has the correct > > function > > pointers in it, but it will have no way of knowing that a new software > > bus was > > created by the second LAN driver to load. > > This is why you use the driver core to manage driver binding. > > > This is why irdma is listening for netdev notifiers, so that whenever a > > new netdev > > appears from a LAN driver loading after irdma, the irdma driver can > > evaluate > > whether the new netdev was created by a LAN driver supported by irdma > > driver. > > Register a device driver to the driver core and wait for the driver > core to call that driver's probe method. Yes, the LAN PF driver is the software component exposing and managing the bus, so it is the one who will call probe/remove of the peer driver (RDMA driver). Although netdev notifiers based approach is needed if the RDMA driver was loaded first before the LAN PF driver (i40e or ice) is loaded. > > Jason
Attachment:
signature.asc
Description: This is a digitally signed message part