On Tue, Nov 22, 2022 at 10:59 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> On Tue, Nov 22, 2022 at 07:02:45AM -0800, Ajit Khaparde wrote:
> > On Wed, Nov 16, 2022 at 5:22 AM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > >
> > ::snip::
> > > > > All PCI management logic and interfaces are needed to be inside eth part
> > > > > of your driver and only that part should implement SR-IOV config. Once
> > > > > user enabled SR-IOV, the PCI driver should create auxiliary devices for
> > > > > each VF. These device will have RDMA capabilities and it will trigger RDMA
> > > > > driver to bind to them.
> > > > I agree and once the PF creates the auxiliary devices for the VF, the RoCE
> > > > Vf indeed get probed and created. But the twist in bnxt_en/bnxt_re
> > > > design is that
> > > > the RoCE driver is responsible for making adjustments to the RoCE resources.
> > >
> > > You can still do these adjustments by checking type of function that
> > > called to RDMA .probe. PCI core exposes some functions to help distinguish between
> > > PF and VFs.
> > >
> > > >
> > > > So once the VF's are created and the bnxt_en driver enables SRIOV adjusts the
> > > > NIC resources for the VF, and such, it tries to call into the bnxt_re
> > > > driver for the
> > > > same purpose.
> > >
> > > If I read code correctly, all these resources are for one PCI function.
> > >
> > > Something like this:
> > >
> > > bnxt_re_probe()
> > > {
> > >   ...
> > >         if (is_virtfn(p))
> > >                 bnxt_re_sriov_config(p);
> > >   ...
> > > }
> > I understand what you are suggesting.
> > But what I want is a way to do this in the context of the PF
> > preferably before the VFs are probed.
>
> I don't understand the last sentence. You call to this sriov_config in
> bnxt_re driver without any protection from VFs being probed,

Let me elaborate -
When a user sets num_vfs to a non-zero number, the PCI driver hook
sriov_configure calls bnxt_sriov_configure(). Once pci_enable_sriov()
succeeds, bnxt_ulp_sriov_cfg() is issued under bnxt_sriov_configure().
All this happens under bnxt_en.
bnxt_ulp_sriov_cfg() ultimately calls into the bnxt_re driver.
Since bnxt_sriov_configure() is called only for PFs, bnxt_ulp_sriov_cfg()
is called for PFs only.

Once bnxt_ulp_sriov_cfg() calls into the bnxt_re via the ulp_ops,
it adjusts the QPs, SRQs, CQs, MRs, GIDs and such.

>
> > So we are trying to call the
> > bnxt_re_sriov_config in the context of handling the PF's
> > sriov_configure implementation. Having the ulp_ops is allowing us to
> > avoid resource wastage and assumptions in the bnxt_re driver.
>
> To which resource wastage are you referring?

Essentially the PF driver reserves a set of above resources for the PF,
and divides the remaining resources among the VFs.
If the calculation is based on sriov_totalvfs instead of sriov_numvfs,
there can be a difference in the resources provisioned for a VF.
And that is because a user may create a subset of VFs instead of the
total VFs allowed in the PCI SR-IOV capability register.
I was referring to the resource wastage in that deployment scenario.

Thanks
Ajit

>
> There are no differences if same limits will be in bnxt_en driver when
> RDMA bnxt device is created or in bnxt_re which will be called once RDMA
> device is created.
>
> Thanks
> >
> > ::snip::
> >
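
To make the totalvfs-versus-numvfs point above concrete, here is a rough
user-space sketch of the arithmetic. It is not bnxt_en or bnxt_re code: the
helper names per_vf_share() and stranded(), and any pool sizes fed to them,
are made up for illustration, and an even split among VFs is an assumption.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a bnxt_en/bnxt_re function): split what is
 * left of a resource pool (QPs, CQs, ...) evenly among vf_count VFs
 * after the PF reservation. Returns 0 if there are no VFs or nothing
 * is left after the PF takes its share.
 */
static unsigned int per_vf_share(unsigned int pool, unsigned int pf_reserved,
                                 unsigned int vf_count)
{
    if (vf_count == 0 || pool <= pf_reserved)
        return 0;
    return (pool - pf_reserved) / vf_count;
}

/*
 * Hypothetical helper: resources left unassigned when the per-VF share
 * is sized for total_vfs but only num_vfs VFs are actually created.
 */
static unsigned int stranded(unsigned int pool, unsigned int pf_reserved,
                             unsigned int total_vfs, unsigned int num_vfs)
{
    unsigned int remaining;

    if (pool <= pf_reserved)
        return 0;
    remaining = pool - pf_reserved;
    return remaining - per_vf_share(pool, pf_reserved, total_vfs) * num_vfs;
}
```

With an assumed pool of 65536 entries and 1024 reserved for the PF, sizing
for 64 total VFs gives each VF 1008 entries, while sizing for the 8 VFs the
user actually enabled gives each 8064. In the first case 56448 entries sit
unused, which is the wastage referred to above.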