On Mon, Jul 22, 2013 at 4:15 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
> On Fri, Jul 19, 2013 at 1:14 PM, Yinghai Lu <yinghai@xxxxxxxxxx> wrote:
>> After commit dc087f2f6a2925e81831f3016b9cbb6e470e7423
>> (PCI: Simplify IOV implementation and fix reference count races)
>> VF need to be removed via virtfn_remove to make sure ref to PF
>> is put back.
>>
>> Some driver (like ixgbe) does not call pci_disable_sriov() if
>> sriov is enabled via /sys/.../sriov_numvfs setting.
>> ixgbe does allow driver for PF get detached, but still have VFs
>> around.
>>
>> But how about PF get removed via /sys or pciehp?
>>
>> During hot-remove, VF will still hold one ref to PF and it
>> prevent PF to be removed.
>> That make the next hot-add fails, as old PF dev struct is still around.
>>
>> We need to add pci_disable_sriov() calling during pci dev removing.
>>
>> Need this one for v3.11
>
> Needs explanation.  Pretend Linus is asking why we should put this in
> after the merge window :)
>
> I think the answer is that dc087f2f introduced a regression in certain
> hot-remove/hot-add scenarios, but an example transcript showing the
> issue would help a lot.

For Intel 10GbE (ixgbe), the user can enable SR-IOV through
/sys/.../sriov_numvfs after the PF driver is loaded.

If the card is in a pciehp slot and the user presses the button or
writes to /sys/bus/pci/slots/../power to turn off the power, the PF
driver is stopped (but it does not call pci_disable_sriov()), the core
then tries to remove the PF, and the power is turned off.

However, the PF's pci_dev struct is not actually freed, because the VFs
still hold references to it.

As a result, the next hot-add does not work: the old pci_dev struct is
still there, and the VF structs still hold their old references to it.
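The reference problem described above can be illustrated with a toy
user-space model. This is only a sketch, not kernel code: every name in
it (toy_dev, toy_get, toy_put, toy_add_vf, toy_disable_sriov,
toy_remove_pf) is hypothetical, and the single integer counter merely
stands in for the device refcount. It assumes each VF takes one
reference on its PF and only gives it back when the VF itself is torn
down.

```c
/* Toy user-space model of the PF/VF reference problem; NOT kernel code.
 * All names here are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_VFS 8

struct toy_dev {
	int refcount;                     /* stands in for the device refcount */
	bool freed;                       /* set when the last reference is dropped */
	struct toy_dev *physfn;           /* for a VF: the PF it holds a reference on */
	struct toy_dev *vfs[TOY_MAX_VFS]; /* for a PF: its currently enabled VFs */
	int num_vfs;
};

static void toy_init(struct toy_dev *d)
{
	d->refcount = 1;                  /* the reference owned by the core */
	d->freed = false;
	d->physfn = NULL;
	d->num_vfs = 0;
}

static void toy_get(struct toy_dev *d)
{
	d->refcount++;
}

static void toy_put(struct toy_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->freed = true;          /* a real release callback would free here */
}

/* Enable one VF: the VF pins its PF (assumed: one reference per VF). */
static void toy_add_vf(struct toy_dev *pf, struct toy_dev *vf)
{
	assert(pf->num_vfs < TOY_MAX_VFS);
	toy_init(vf);
	toy_get(pf);
	vf->physfn = pf;
	pf->vfs[pf->num_vfs++] = vf;
}

/* Tear down every VF and give back the PF reference each one held. */
static void toy_disable_sriov(struct toy_dev *pf)
{
	while (pf->num_vfs > 0) {
		struct toy_dev *vf = pf->vfs[--pf->num_vfs];

		vf->physfn = NULL;
		toy_put(pf);              /* the VF's reference on the PF */
		toy_put(vf);              /* the core's reference on the VF */
	}
}

/* Remove the PF; 'fixed' selects whether SR-IOV is torn down first. */
static void toy_remove_pf(struct toy_dev *pf, bool fixed)
{
	if (fixed)
		toy_disable_sriov(pf);
	toy_put(pf);                      /* drop the core's reference */
}
```

Calling toy_remove_pf(pf, false) on a PF that still has VFs leaves
pf->freed false, which corresponds to the stale pci_dev that blocks the
next hot-add. Calling it with fixed set to true drops the VF references
first, so the final toy_put() releases the PF.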
>
>> Signed-off-by: Yinghai Lu <yinghai@xxxxxxxxxx>
>> Cc: Jiang Liu <liuj97@xxxxxxxxx>
>> Cc: Alexander Duyck <alexander.h.duyck@xxxxxxxxx>
>> Cc: Donald Dutile <ddutile@xxxxxxxxxx>
>> Cc: Greg Rose <gregory.v.rose@xxxxxxxxx>
>>
>> ---
>>  drivers/pci/remove.c |    3 +++
>>  1 file changed, 3 insertions(+)
>>
>> Index: linux-2.6/drivers/pci/remove.c
>> ===================================================================
>> --- linux-2.6.orig/drivers/pci/remove.c
>> +++ linux-2.6/drivers/pci/remove.c
>> @@ -34,6 +34,9 @@ static void pci_stop_dev(struct pci_dev
>>
>>  static void pci_destroy_dev(struct pci_dev *dev)
>>  {
>> +       /* remove VF, if PF driver skip that */
>> +       pci_disable_sriov(dev);
>
> How did you decide to call pci_disable_sriov() here rather than, for
> example, in pci_stop_dev()?  We already have some PME and ASPM cleanup
> in pci_stop_dev(), and this seems sort of similar to those.

Yes, pci_stop_dev() is better.

I was thinking that pci_stop_dev() could be used when the PF's driver is
unloaded or detached.

Yinghai