Hi Jakub,

> -----Original Message-----
> From: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>
> Sent: Thursday, November 7, 2019 2:33 PM
> To: Parav Pandit <parav@xxxxxxxxxxxx>
> Cc: alex.williamson@xxxxxxxxxx; davem@xxxxxxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; Saeed Mahameed
> <saeedm@xxxxxxxxxxxx>; kwankhede@xxxxxxxxxx; leon@xxxxxxxxxx;
> cohuck@xxxxxxxxxx; Jiri Pirko <jiri@xxxxxxxxxxxx>;
> linux-rdma@xxxxxxxxxxxxxxx; Or Gerlitz <gerlitz.or@xxxxxxxxx>
> Subject: Re: [PATCH net-next 00/19] Mellanox, mlx5 sub function support
>
> On Thu, 7 Nov 2019 10:04:48 -0600, Parav Pandit wrote:
> > Mellanox sub function capability allows users to create several
> > hundreds of networking and/or rdma devices without depending on
> > PCI SR-IOV support.
>
> You call the new port type "sub function" but the devlink port flavour
> is mdev.

Sub function is the internal driver structure; the abstract entity at the
user and stack level is the mdev. Hence the port flavour is mdev.

> As I'm sure you remember you nacked my patches exposing NFP's PCI sub
> functions which are just regions of the BAR without any mdev
> capability. Am I in the clear to repost those now? Jiri?

For sure I didn't nack it. :-) What I remember discussing offline/on the
mailing list is (a) exposing mdevs/sub functions as devlink sub-ports is
not a good abstraction, and (b) having the user create/delete eswitch
sub-ports would be hard to fit into the overall usage model.

> > Overview:
> > ---------
> > Mellanox ConnectX sub functions are exposed to user as a mediated
> > device (mdev) [2] as discussed in RFC [3] and further during
> > netdevconf0x13 at [4].
> >
> > mlx5 mediated device (mdev) enables users to create multiple
> > netdevices and/or RDMA devices from single PCI function.
> >
> > Each mdev maps to a mlx5 sub function.
> > mlx5 sub function is similar to PCI VF. However it doesn't have its
> > own PCI function and MSI-X vectors.
> >
> > mlx5 mdevs share common PCI resources such as PCI BAR region, MSI-X
> > interrupts.
> >
> > Each mdev has its own window in the PCI BAR region, which is
> > accessible only to that mdev and applications using it.
> >
> > Each mlx5 sub function has its own resource namespace for RDMA
> > resources.
> >
> > mdevs are supported when eswitch mode of the devlink instance is in
> > switchdev mode described in devlink documentation [5].
>
> So presumably the mdevs don't spawn their own devlink instance today,
> but once mapped via VIRTIO to a VM they will create one?

An mdev doesn't spawn a devlink instance at the time the user creates it,
just like PCI: when the PCI bus driver enumerates and creates a PCI
device, there is no devlink instance for it either. The mdev's devlink
instance is created when the mlx5_core driver binds to the mdev device
(again similar to PCI, where the devlink instance is created when
mlx5_core binds to the PCI device). I should have put an example of this
in patch-15, which creates/deletes the devlink instance of the mdev. I
will revise the commit log of patch-15 to include it. Good point.

> It could be useful to specify.

Yes, it's certainly useful. I missed putting the example in the commit
log of patch-15.

> > Network side:
> > - By default the netdevice and the rdma device of mlx5 mdev cannot
> > send or receive any packets over the network or to any other mlx5
> > mdev.
>
> Does this mean the frames don't fall back to the repr by default?

Probably I wasn't clear. What I wanted to say is that frames transmitted
by the mdev's netdevice and rdma device don't go out to the network; they
go to the representor device. The user must configure the representor to
send/receive/steer traffic to the mdev.
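
For example (a rough sketch, not taken from the series; ens2f0 as the
uplink representor and ens2f0_r0 as the mdev's representor are made-up
names), forwarding all traffic between the uplink and the mdev could
look like:

    # hypothetical names: ens2f0 = uplink representor, ens2f0_r0 = mdev representor
    $ ip link set dev ens2f0_r0 up
    $ tc qdisc add dev ens2f0 ingress
    $ tc qdisc add dev ens2f0_r0 ingress
    $ tc filter add dev ens2f0 ingress protocol all flower \
          action mirred egress redirect dev ens2f0_r0
    $ tc filter add dev ens2f0_r0 ingress protocol all flower \
          action mirred egress redirect dev ens2f0

Putting both representors under a Linux bridge works as well; the point
is that the forwarding policy lives on the representor, not on the mdev's
netdevice.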
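
And on the devlink instance point above, the kind of example I have in
mind for the patch-15 commit log is roughly the following (the parent PCI
address, the mdev type name and the devlink handle shown below are made
up for illustration; patch-15 will show the real ones):

    # create an mdev on the parent PCI function via the standard mdev sysfs interface
    $ UUID=$(uuidgen)
    $ echo $UUID > /sys/bus/pci/devices/0000:06:00.0/mdev_supported_types/mlx5_core-local/create

    # once mlx5_core binds to the new mdev device, its devlink instance appears
    $ devlink dev show
    pci/0000:06:00.0
    mdev/<uuid>        # exact handle format as defined in patch-15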