On Thu, Jun 28, 2018 at 11:14:57AM -0700, Jakub Kicinski wrote:
> On Thu, 28 Jun 2018 13:07:05 -0500, Bjorn Helgaas wrote:
> > On Thu, Jun 28, 2018 at 09:17:09AM -0700, Jakub Kicinski wrote:
> > > On Thu, Jun 28, 2018 at 6:56 AM, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > > On Tue, Jun 26, 2018 at 09:40:02AM -0700, Jakub Kicinski wrote:
> > > >> On Mon, 18 Jun 2018 15:01:42 -0700, Jakub Kicinski wrote:
> > > >> > Bjorn points out that currently the core and most drivers don't
> > > >> > clean up dev->sriov->driver_max_VFs settings on .remove().  This
> > > >> > means that if a different driver is bound afterwards it will
> > > >> > inherit the old setting:
> > > >> >
> > > >> >  - load PF driver 1
> > > >> >  - driver 1 calls pci_sriov_set_totalvfs() to reduce driver_max_VFs
> > > >> >  - unload PF driver 1
> > > >> >  - load PF driver 2
> > > >> >    # driver 2 does not know to call pci_sriov_set_totalvfs()
> > > >> >
> > > >> > Some drivers (e.g. nfp) used to do the cleanup by calling
> > > >> > pci_sriov_set_totalvfs(dev, 0), since 0 was equivalent to no
> > > >> > driver limit set.  After commit 8d85a7a4f2c9 ("PCI/IOV: Allow PF
> > > >> > drivers to limit total_VFs to 0"), 0 no longer has this special
> > > >> > meaning.
> > > >> >
> > > >> > The need to reset driver_max_VFs comes from the fact that old FW
> > > >> > builds may not expose their VF limits to the drivers, and depend
> > > >> > on the ability to reject the configuration change when the driver
> > > >> > notifies the FW via the struct pci_driver->sriov_configure()
> > > >> > callback.  Therefore, if there is no information on VF count
> > > >> > limits we should use the PCI capability max, and not the last
> > > >> > value set by pci_sriov_set_totalvfs().
> > > >> >
> > > >> > Reset driver_max_VFs back to total_VFs after device remove.  If
> > > >> > drivers want to reload FW/reconfigure the device while the driver
> > > >> > is bound we may add an explicit pci_sriov_reset_totalvfs(), but
> > > >> > for now no driver is doing that.
> > > >> >
> > > >> > Fixes: 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
> > > >> > Signed-off-by: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>
> > > >>
> > > >> Any comments? :(  Could this still make it into 4.18?
> > > >
> > > > Is this a fix for something we broke during the v4.18 merge window?
> > > > Or does it fix something that's been broken for a long time?  I
> > > > think the latter, but haven't looked carefully yet.
> > >
> > > This is half of a fix, really.  The NFP driver does
> > > pci_sriov_set_totalvfs(pdev, 0) to clear the limit; now that call
> > > disables SR-IOV completely.  If this patch gets applied I will drop
> > > those pci_sriov_set_totalvfs(pdev, 0) calls from the driver and the
> > > regression will be solved.
> >
> > Sorry, I missed the regression part.  8d85a7a4f2c9 ("PCI/IOV: Allow
> > PF drivers to limit total_VFs to 0") appeared in v4.18-rc1, so I
> > guess the problem is that nfp works correctly in v4.17, but is
> > broken in v4.18-rc1?
> >
> > I'd just like to understand the breakage and fix better.  In v4.17:
> >
> >   # nfp loaded:
> >   nfp_pci_probe
> >     nfp_pcie_sriov_read_nfd_limit
> >       nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err)
> >       if (err)                          # FW doesn't expose limit (?)
> >         pci_sriov_set_totalvfs(0)
> >           dev->sriov->driver_max_VFs = 0  # PCI_SRIOV_CTRL_VFE is clear
> >
> >   # userspace writes N to sysfs "sriov_numvfs":
> >   sriov_numvfs_store                    # write N
> >     pci_sriov_get_totalvfs
> >       return dev->sriov->total_VFs      # since driver_max_VFs == 0
> >     driver->sriov_configure(N)
> >       nfp_pcie_sriov_configure(N)
> >         nfp_pcie_sriov_enable(N)
> >
> > In v4.18-rc1, it's the same except pci_sriov_get_totalvfs() now
> > returns 0:
> >
> >   # nfp loaded:
> >   nfp_pci_probe
> >     nfp_pcie_sriov_read_nfd_limit
> >       nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err)
> >       if (err)                          # FW doesn't expose limit (?)
> >         pci_sriov_set_totalvfs(0)
> >           dev->sriov->driver_max_VFs = 0  # PCI_SRIOV_CTRL_VFE is clear
> >
> >   # userspace writes N to sysfs "sriov_numvfs":
> >   sriov_numvfs_store                    # write N
> >     pci_sriov_get_totalvfs
> >       return 0                          # (driver_max_VFs == 0)
> >     return -ERANGE                      # because N > 0
> >
> > Am I on the right track?
>
> That's exactly it!

OK.  This regression happens even on the first module load; it doesn't
require an unload and reload.

If that's the case, *this* patch will not actually fix the regression,
because it only changes things when we detach the driver.

The *nfp* patch that removes the pci_sriov_set_totalvfs(0) calls will
fix the regression, because driver_max_VFs will be unchanged from its
initial setting (TotalVFs from the SR-IOV capability).

So if I'd figured this out earlier, we would have squashed the nfp
patch into 8d85a7a4f2c9 and avoided the regression altogether.

*This* patch fixes a legitimate unload/reload issue, but I'm not clear
on whether it's a regression anybody actually sees.  If you see it,
what scenario is it?

Bjorn
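[The behavior change Bjorn traces above boils down to
pci_sriov_get_totalvfs() no longer falling back to total_VFs when
driver_max_VFs == 0.  A simplified before/after sketch of that helper,
condensed from the two kernel versions discussed -- not verbatim
source:]

  /* v4.17: driver_max_VFs == 0 means "no driver limit set" */
  int pci_sriov_get_totalvfs(struct pci_dev *dev)
  {
          if (!dev->is_physfn)
                  return 0;

          if (dev->sriov->driver_max_VFs)
                  return dev->sriov->driver_max_VFs;

          return dev->sriov->total_VFs;   /* fall back to capability max */
  }

  /*
   * v4.18-rc1 (after 8d85a7a4f2c9): driver_max_VFs is initialized to
   * total_VFs and returned unconditionally, so an explicit 0 now means
   * "zero VFs allowed" -- which is why nfp's pci_sriov_set_totalvfs(pdev, 0)
   * makes sriov_numvfs_store() return -ERANGE for any N > 0.
   */
  int pci_sriov_get_totalvfs(struct pci_dev *dev)
  {
          if (!dev->is_physfn)
                  return 0;

          return dev->sriov->driver_max_VFs;
  }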
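[And a minimal sketch of the reset the commit message proposes: when
the PF driver detaches, the core forgets any limit the driver set, so
the next driver starts from the SR-IOV capability's TotalVFs.  The
function name and call site are illustrative assumptions, not
necessarily what the patch as merged uses:]

  /* Hypothetical hook, called by the PCI core when a PF driver unbinds */
  void pci_iov_remove(struct pci_dev *dev)
  {
          struct pci_sriov *iov = dev->sriov;

          if (!dev->is_physfn)
                  return;

          /* Undo any pci_sriov_set_totalvfs() limit left by the driver */
          iov->driver_max_VFs = iov->total_VFs;
  }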