On Thu, 2013-02-14 at 20:53 -0700, Alex Williamson wrote:
> On Thu, 2013-02-14 at 16:47 -0700, Bjorn Helgaas wrote:
> > On Thu, Feb 14, 2013 at 11:37 AM, Alex Williamson
> > <alex.williamson@xxxxxxxxxx> wrote:
> > > A bus reset can trigger a presence detection change and result in a
> > > surprise hotplug. This is generally not what we want to happen when
> > > trying to reset a device. Disable the presence detection control on
> > > bridges around bus reset.
> >
> > This is a really interesting situation, and I'm not quite ready to
> > sign up to the idea that this is really a problem and that if it is,
> > this is the way we want to fix it.
> >
> > What would happen if we *did* handle this as a hotplug event, with a
> > removal followed by an add?
> >
> > The scheme where pci_reset_function() does "pci_save_state(dev);
> > pci_dev_reset(dev); pci_restore_state(dev);" makes me nervous.
> >
> > We're saving and restoring some of PCI config space around the reset,
> > but there's no guarantee that we're preserving *all* the important
> > state in config space because I think devices can have non-architected
> > device-specific things in config space that we don't know how to
> > save/restore.
> >
> > Devices also have internal state not exposed via config space. That
> > state is lost during the reset but can't be restored by
> > pci_restore_state(). So it seems like pci_reset_function() is
> > pretending to do something it can't really do reliably.
> >
> > If we make it so a reset is always handled as a remove+add, then we'll
> > use a more generic path, and we'll get all the stuff you expect when
> > initializing a new device -- resource assignment, IRQ setup, quirks,
> > etc. Quirks in particular seem like something we want, but don't
> > currently get with pci_reset_function().
> >
> > Oh, and the "disable presence detect" approach below only works for
> > things below a PCIe bridge with native hotplug, right?
> > I wonder what happens if we reset devices below a bridge using SHPC or
> > acpiphp.
>
> Triggering a remove+add is not useful for the way we use it today. The
> users I'm aware of are KVM device assignment and VFIO, where we trigger
> it in an attempt to get the device to a known state so that we have some
> hope of repeatability. In those scenarios the reset is initiated by the
> driver. The interface isn't meant to guarantee the device is returned
> to an identical state as it was before reset. If it did, why would we
> call it? We want to get to a state as near to power on, but still with
> config programming, as we can.
>
> Being driver directed, having the reset initiate a remove is pretty near
> the last thing we want. That limits the scope of calling it to only
> when the driver can readily release the device. If we have the device
> attached to a guest or userspace driver, that's potentially a lot of
> setup and teardown and effectively extending a surprise removal all the
> way up the stack.
>
> Obviously a bus reset is a big hammer and we do exhaust all the little
> hammers of flr and pm reset before we try it, but in this case, we know
> the device that's going away and with all likelihood, it's coming right
> back at the same location. If we take the path of forcing a remove+add,
> let's just remove it from the reset_function call path and we'll do
> without reset for those devices. Thanks,

Time to revisit this bug. Clearly when a driver or userspace calls
pci_reset_function the intention is not to have the device be
hot-unplugged and re-plugged. So I think we either need to prevent that
from happening or politely decline the reset. I don't really know how
to do this on acpiphp or shpc or whatever other hotplug controllers we
support. So, what if we add a reset_slot callback to hotplug_slot_ops?
We could then make pci_parent_bus_reset do something like:

	if (dev->slot && dev->slot->hotplug_slot) {
		if (!dev->slot->hotplug_slot->ops->reset_slot)
			return -ENOTTY;

		return dev->slot->hotplug_slot->ops->reset_slot(dev->slot->hotplug_slot);
	} else {
		... standard secondary bus reset...
	}

I'd actually also like to add a pci_reset_bus interface because we do
have cases where pci_reset_function is not sufficient (the device
doesn't do any useful reset of its own and pci_parent_bus_reset won't
because there are other devices on the bus). Graphics cards in
particular are biting us here. When all of the devices on the bus are
owned by a driver, this would provide a less device-dependent reset. It
would use the same logic and code as enabled with reset_slot. Thoughts?

Thanks,

Alex
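
The reset_slot dispatch proposed above can be exercised outside the kernel. What follows is only a minimal userspace sketch under stated assumptions: the structs are cut-down mocks rather than the kernel's definitions, the reset_slot signature is a guess at what the callback might look like, and mock_secondary_bus_reset(), mock_reset_slot() and mock_parent_bus_reset() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock types: simplified stand-ins, NOT the kernel's definitions. */
struct hotplug_slot;

struct hotplug_slot_ops {
	/* Proposed callback; this signature is an assumption. */
	int (*reset_slot)(struct hotplug_slot *slot);
};

struct hotplug_slot {
	const struct hotplug_slot_ops *ops;
	int reset_count;	/* bookkeeping for this sketch only */
};

struct pci_slot {
	struct hotplug_slot *hotplug_slot;
};

struct pci_dev {
	struct pci_slot *slot;
	int bus_reset_count;	/* bookkeeping for this sketch only */
};

/* Hypothetical stand-in for the standard secondary bus reset. */
static int mock_secondary_bus_reset(struct pci_dev *dev)
{
	dev->bus_reset_count++;
	return 0;
}

/*
 * Hypothetical controller-specific reset.  A real implementation would
 * keep the reset from being seen as a surprise hotplug; this one only
 * counts how often it was called.
 */
static int mock_reset_slot(struct hotplug_slot *slot)
{
	slot->reset_count++;
	return 0;
}

/* Mock of the proposed dispatch in pci_parent_bus_reset. */
static int mock_parent_bus_reset(struct pci_dev *dev)
{
	assert(dev != NULL);

	if (dev->slot && dev->slot->hotplug_slot) {
		struct hotplug_slot *hs = dev->slot->hotplug_slot;

		/* Hotplug slot without a reset method: decline politely. */
		if (!hs->ops || !hs->ops->reset_slot)
			return -ENOTTY;

		return hs->ops->reset_slot(hs);
	}

	/* No hotplug slot involved: plain secondary bus reset. */
	return mock_secondary_bus_reset(dev);
}
```

In this sketch a device with no hotplug slot takes the plain bus reset path, a device in a hotplug slot whose controller provides no reset_slot gets -ENOTTY, and a device whose controller does provide one is reset through the controller without the generic bus reset running at all.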