Re: [PATCH] PCI: reset driver SR-IOV state after remove

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 28 Jun 2018 13:07:05 -0500, Bjorn Helgaas wrote:
> On Thu, Jun 28, 2018 at 09:17:09AM -0700, Jakub Kicinski wrote:
> > On Thu, Jun 28, 2018 at 6:56 AM, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:  
> > > On Tue, Jun 26, 2018 at 09:40:02AM -0700, Jakub Kicinski wrote:  
> > >> On Mon, 18 Jun 2018 15:01:42 -0700, Jakub Kicinski wrote:  
> > >> > Bjorn points out that currently core and most of the drivers don't
> > >> > clean up dev->sriov->driver_max_VFs settings on .remove().  This
> > >> > means that if a different driver is bound afterwards it will
> > >> > inherit the old setting:
> > >> >
> > >> >   - load PF driver 1
> > >> >   - driver 1 calls pci_sriov_set_totalvfs() to reduce driver_max_VFs
> > >> >   - unload PF driver 1
> > >> >   - load PF driver 2
> > >> >   # driver 2 does not know to call pci_sriov_set_totalvfs()
> > >> >
> > >> > Some drivers (e.g. nfp) used to do the clean up by calling
> > >> > pci_sriov_set_totalvfs(dev, 0), since 0 was equivalent to no driver
> > >> > limit set.  After commit 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers
> > >> > to limit total_VFs to 0") 0 no longer has this special meaning.
> > >> >
> > >> > The need to reset driver_max_VFs comes from the fact that old FW
> > >> > builds may not expose its VF limits to the drivers, and depend on
> > >> > the ability to reject the configuration change when driver notifies
> > >> > the FW as part of struct pci_driver->sriov_configure() callback.
> > >> > Therefore if there is no information on VF count limits we should
> > >> > use the PCI capability max, and not the last value set by
> > >> > pci_sriov_set_totalvfs().
> > >> >
> > >> > Reset driver_max_VFs back to total_VFs after device remove.  If
> > >> > drivers want to reload FW/reconfigure the device while driver is
> > >> > bound we may add an explicit pci_sriov_reset_totalvfs(), but for
> > >> > now no driver is doing that.
> > >> >
> > >> > Fixes: 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
> > >> > Signed-off-by: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>  
> > >>
> > >> Any comments? :(  Could this still make it into 4.18?  
> > >
> > > Is this a fix for something we broke during the v4.18 merge window?
> > > Or does it fix something that's been broken for a long time?  I think
> > > the latter, but haven't looked carefully yet.  
> > 
> > This is half of a fix, really.  NFP driver does
> > pci_sriov_set_totalvfs(pdev, 0) to clear the limit.  Now it will
> > disable SR-IOV completely.  If this patch gets applied I will drop
> > those pci_sriov_set_totalvfs(pdev, 0) calls from the driver and the
> > regression will be solved.  
> 
> Sorry, I missed the regression part.  8d85a7a4f2c9 ("PCI/IOV: Allow PF
> drivers to limit total_VFs to 0") appeared in v4.18-rc1, so I guess the
> problem is that nfp works correctly in v4.17, but is broken in v4.18-rc1?
> 
> I'd just like to understand the breakage and fix better.  In v4.17:
> 
>   # nfp loaded:
>   nfp_pci_probe
>     nfp_pcie_sriov_read_nfd_limit
>       nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err)
>       if (err)                              # FW doesn't expose limit (?)
> 	pci_sriov_set_totalvfs(0)
> 	  dev->sriov->driver_max_VFs = 0    # PCI_SRIOV_CTRL_VFE is clear
> 
>   # userspace writes N to sysfs "sriov_numvfs":
>   sriov_numvfs_store                    # write N
>     pci_sriov_get_totalvfs
>       return dev->sriov->total_VFs	# since driver_max_VFs == 0
>     driver->sriov_configure(N)
>       nfp_pcie_sriov_configure(N)
>         nfp_pcie_sriov_enable(N)
> 
> In v4.18-rc1, it's the same except pci_sriov_get_totalvfs() now returns 0:
> 
>   # nfp loaded:
>   nfp_pci_probe
>     nfp_pcie_sriov_read_nfd_limit
>       nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err)
>       if (err)                              # FW doesn't expose limit (?)
> 	pci_sriov_set_totalvfs(0)
> 	  dev->sriov->driver_max_VFs = 0    # PCI_SRIOV_CTRL_VFE is clear
> 
>   # userspace writes N to sysfs "sriov_numvfs":
>   sriov_numvfs_store                    # write N
>     pci_sriov_get_totalvfs
>       return 0                          # (driver_max_VFs == 0)
>     return -ERANGE                      # because N > 0
> 
> Am I on the right track?  

That's exactly it!

> Sounds like we may want to merge this patch and the nfp patch to
> remove the pci_sriov_set_totalvfs() calls together to make sure they
> get merged in the correct order.

NFP patch coming right up!  For some reason I was planning to push it
via the net tree, but on reflection it doesn't matter..



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux