Hi Bjorn,

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@xxxxxxxxxx]
> Sent: Wednesday, March 29, 2017 4:55 PM
> To: Zytaruk, Kelly
> Cc: linux-pci@xxxxxxxxxxxxxxx; Alex Williamson
> Subject: Re: Having problems resetting a PCI device
>
> Hi Kelly,
>
> On Wed, Mar 29, 2017 at 08:03:33PM +0000, Zytaruk, Kelly wrote:
> > I have a PCI device that is sitting behind a bridge.
> >
> > Under certain reproducible circumstances the PCI device will become
> > inactive. Reading the PCI config space returns all 0xFFFFFFFF.
> >
> > The bridge appears to still be functional. Reading the status from the
> > bridge I see a Fatal Error due to a Surprise Down event.
>
> Just to be specific, is this the "Surprise Down Error" in the AER uncorrectable
> error status register? "lspci -vv" probably decodes all that for you.

Yes, it shows up in the bridge device's PCI config space.

> > I am trying to figure out how to bring the device back online.
> >
> > I tried toggling the secondary bus reset bit of the Bridge Control
> > Register but it doesn't appear to make any difference. I still see
> > 0xFFFFFFFF in the device config space.
>
> Are you calling pci_reset_function() or doing this by hand?
> pci_reset_function() tries several different strategies, one of which is toggling
> the secondary bus reset bit.

I am doing it by hand. I only found pci_reset_function() about 5 minutes ago
while scanning through pci.c for clues.

> > I provided a pci_error_handler but the error_detected() function is
> > not getting called.
>
> Do you have CONFIG_PCIEAER turned on? I would naively expect AER to log
> something and call your error_detected() function if this error occurs (but I
> haven't looked at the code for a long time).

CONFIG_PCIEAER=y is set, but error_detected() is not getting called.

I also noticed that CONFIG_HOTPLUG_PCI_PCIE=y. How do I trigger a hot unplug
followed by a hot plug? Might that bring the device back?
>
> > Given that these two methods are not helping me out what other choices
> > do I have to either reset the PCI device or hot-plug the device from a
> > kernel driver. Or some other method of bring the device back to life.
> >
> > Note that I am running Linux 4.8 in dom0 on Xen (if that makes a
> > difference).
> >
> > Thanks,
> > Kelly
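
On the hot unplug / hot plug question, one thing that might be worth trying from user space is the sysfs remove and rescan interface. The following is only a minimal sketch, not something confirmed in this thread: the helper name remove_and_rescan and its sysfs_pci parameter are made up for illustration, it needs root on a real system, and it is not clear that a rescan will find the device again if the link is still down after the Surprise Down event (or how this behaves in a Xen dom0).

```python
from pathlib import Path


def remove_and_rescan(bdf: str, sysfs_pci: Path = Path("/sys/bus/pci")) -> None:
    """Remove one PCI function via sysfs, then ask the kernel to rescan.

    bdf is the full address as shown under /sys/bus/pci/devices,
    e.g. "0000:03:00.0" (a hypothetical address, substitute your own).
    sysfs_pci is a hypothetical parameter that exists only so the helper
    can be pointed at a scratch directory for a dry run.
    """
    device_dir = sysfs_pci / "devices" / bdf
    if not device_dir.is_dir():
        raise FileNotFoundError(f"no PCI device at {device_dir}")

    # Writing "1" to <device>/remove detaches the driver and removes the
    # device from the kernel's PCI device list.
    (device_dir / "remove").write_text("1")

    # Writing "1" to /sys/bus/pci/rescan asks the kernel to re-enumerate
    # all PCI buses, which re-adds any device that responds again.
    (sysfs_pci / "rescan").write_text("1")
```

Calling remove_and_rescan("0000:03:00.0") would act on that (hypothetical) address. Inside a kernel driver, the rough counterparts appear to be pci_stop_and_remove_bus_device() and pci_rescan_bus(), with the usual rescan/remove locking; treat that as a pointer to investigate rather than a verified recipe.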