Re: [PATCH v2 3/3] PCI: Stop sriov after stop PF if PF's driver skip that

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 08/01/2013 04:16 PM, Bjorn Helgaas wrote:
On Mon, Jul 29, 2013 at 2:32 PM, Don Dutile<ddutile@xxxxxxxxxx>  wrote:
On 07/29/2013 03:58 PM, Bjorn Helgaas wrote:

On Sat, Jul 27, 2013 at 9:11 AM, Yinghai Lu<yinghai@xxxxxxxxxx>   wrote:

After commit dc087f2f6a2925e81831f3016b9cbb6e470e7423
(PCI: Simplify IOV implementation and fix reference count races)
VF need to be removed via virtfn_remove to make sure ref to PF
is put back.

OK, I'm lost here.  I think this is the scenario where Yinghai saw a
regression (please correct me if not):

   00:03.0 Root port to [bus 02]
   02:00.0 SR-IOV NIC (PF in slot 2)
   02:00.1 VF (for example)

   # echo -n 0>  /sys/bus/pci/slots/2/power
   02:00.0 PF is powered off
   02:00.0 PF pci_dev is released, but VF 02:00.1 pci_dev still exists
and holds a reference to the PF pci_dev, so the 02:00.0 pci_dev is not
actually deallocated

   # echo -n 1>  /sys/bus/pci/slots/2/power
   pciehp 0000:00:03.0:pcie04: Device 0000:02:00.0 already exists at
0000:02:00, cannot hot-add

Prior to dc087f2f6 ("Simplify IOV implementation and fix reference
count races"), this scenario (powering the slot off then back on)
apparently worked, and hot-adding 02:00.0 worked fine.

But what about the VF pci_devs?  Prior to dc087f2f6, I assume they
still existed even after we removed power from the PF.  But obviously
the hardware VFs are disabled when we power the slot back up.  It
doesn't make sense to me to have pci_devs for these VFs that are no
longer enabled, so maybe I'm missing something here.

Some driver (like ixgbe) does not call pci_disable_sriov() if
sriov is enabled via /sys/.../sriov_numvfs setting.
ixgbe does allow driver for PF get detached, but still have VFs
around.

Is this something ixgbe should be doing differently?

I'm not 100% sold on the idea of the VFs staying active after the
driver releases the PF.  It seems asymmetric because I think the
driver has to claim the PF to *enable* the VFs, but we don't disable
them when releasing the PF.

What's the use case for detaching the PF driver while the VFs are active?

VF's assigned to (KVM) guest (via device-assignment).
Virtually, it's as if the enet cable is unplugged to the VF in the guest --
the device is still there (the PF wasn't unplugged, just the driver
de-configured).

OK, but why is it necessary to detach and reattach the PF driver at
all?  Are you trying to update the PF driver or something?  Change the
number of VFs?

One case is to update the PF driver; the other is 'operator error' :-/
Want to not crash the guest w/VFs, and allow operator to 'mend their error',
i.e., re-load the PF driver back.

Pre-sysfs-based configuration, the std way to configure the VFs into
a system was to unload the PF driver, and reload it with a vf-enabling
parameter
(like max_vfs=<n>  in the case of ixgbe, igb).

Yes.  But I assume the first time the PF was loaded, there were no VFs
enabled.  So disabling VFs at unload wouldn't cause any problem there.
  Then you reload the driver with "max_vfs=<n>".  The driver enables
VFs.  Is there any reason to unload the PF driver again?

Now, if someone unloaded the PF driver in the host, the unplanned removal
of the PF enabled the VF to crash the host (maybe AlexD can provide the
details how that occurred).
So, the solution was to 'pause' the VF operation and let packets drop
in the guest, and re-enable the VF operation if the PF driver was
re-configured.

I don't get this.  Why should we tolerate "unplanned removal of the
PF"?  If you yank the card, I would *expect* anything using it to
crash.

Last time I checked, PCIe handles surprise removal, electrically,
on unplanned removal -- it's capacitively coupled so crashes due to
odd signalling transition can be handled (malformed pcie packet, w/c,
which aer handles/dismisses).

Note, the i(x)gb(e) code was designed to handle operator error around
PF driver removal.  hot plug/unplug was probably not tested against
the 'save the VF state' code in the i(x)gb(e)vf drivers.
Again, try asking the driver owner(s).
Bjorn

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux