On Wed, May 16, 2018 at 08:28:39PM +0530, poza@xxxxxxxxxxxxxx wrote:
> On 2018-05-16 18:34, Bjorn Helgaas wrote:
> > On Wed, May 16, 2018 at 05:45:58PM +0530, poza@xxxxxxxxxxxxxx wrote:
> > > On 2018-05-16 16:22, Bjorn Helgaas wrote:
> > > > On Wed, May 16, 2018 at 01:46:25PM +0530, poza@xxxxxxxxxxxxxx wrote:
> > > > > I am sorry I pasted the wrong snippet.
> > > > > following needs to be fixed in v17.
> > > > > from:
> > > > >     if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
> > > > >         /*
> > > > >          * If the error is reported by a bridge, we think this error
> > > > >          * is related to the downstream link of the bridge, so we
> > > > >          * do error recovery on all subordinates of the bridge instead
> > > > >          * of the bridge and clear the error status of the bridge.
> > > > >          */
> > > > >         pci_walk_bus(dev->subordinate, report_resume, &result_data);
> > > > >         pci_cleanup_aer_uncorrect_error_status(dev);
> > > > >     }
> > > > >
> > > > > to
> > > > >
> > > > >     if (service==AER && dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
> > > > >         /*
> > > > >          * If the error is reported by a bridge, we think this error
> > > > >          * is related to the downstream link of the bridge, so we
> > > > >          * do error recovery on all subordinates of the bridge instead
> > > > >          * of the bridge and clear the error status of the bridge.
> > > > >          */
> > > > >         pci_walk_bus(dev->subordinate, report_resume, &result_data);
> > > > >         pci_cleanup_aer_uncorrect_error_status(dev);
> > > > >     }
> > > > >
> > > > > this is only needed in case of AER.
> > > >
> > > > Oh, I missed this before.  It makes sense to clear the AER status
> > > > here, but why do we need to call report_resume()?  We just called all
> > > > the driver .remove() methods and detached the drivers from the
> > > > devices.
> > > > So I don't think report_resume() will do anything
> > > > ("dev->driver" should be NULL) except set the dev->error_state to
> > > > pci_channel_io_normal.  We probably don't need that because we didn't
> > > > change error_state in this fatal error path.
> > >
> > > if you remember, the path ends up calling
> > > aer_error_resume
> > >
> > > the existing code ends up calling aer_error_resume as follows.
> > >
> > >     do_recovery(pci_dev)
> > >       broadcast_error_message(..., error_detected, ...)
> > >       if (AER_FATAL)
> > >         reset_link(pci_dev)
> > >           udev = BRIDGE ? pci_dev : pci_dev->bus->self
> > >           driver->reset_link(udev)
> > >             aer_root_reset(udev)
> > >       if (CAN_RECOVER)
> > >         broadcast_error_message(..., mmio_enabled, ...)
> > >       if (NEED_RESET)
> > >         broadcast_error_message(..., slot_reset, ...)
> > >       broadcast_error_message(dev, ..., report_resume, ...)
> > >         if (BRIDGE)
> > >           report_resume
> > >             driver->resume
> > >               pcie_portdrv_err_resume
> > >                 device_for_each_child(..., resume_iter)
> > >                   resume_iter
> > >                     driver->error_resume
> > >                       aer_error_resume
> > >           pci_cleanup_aer_uncorrect_error_status(pci_dev)  # only if BRIDGE
> > >             pci_write_config_dword(PCI_ERR_UNCOR_STATUS)
> > >
> > > hence I think we have to call it in order to clear the root port
> > > PCI_ERR_UNCOR_STATUS and PCI_EXP_DEVSTA.
> > > makes sense ?
> >
> > I know I sent you the call graph above, so you would think I might
> > understand it, but you would be mistaken :)  This still doesn't make
> > sense to me.
> >
> > I think your point is that we need to call aer_error_resume().  That
> > is the aerdriver.error_resume() method.  The AER driver only binds to
> > root ports.
> >
> > This path:
> >
> >     pcie_do_fatal_recovery
> >       pci_walk_bus(dev->subordinate, report_resume, &result_data)
> >
> > calls report_resume() for every device on the dev->subordinate bus
> > (and for anything below those devices).
> > There are no root ports on
> > that dev->subordinate bus, because root ports are always on a root
> > bus, never on a subordinate bus.
> >
> > So I don't see how report_resume() can ever get to aer_error_resume().
> > Can you instrument that path and verify that it actually does get
> > there somehow?
>
> you are right....the call
> pci_walk_bus(dev->subordinate, report_resume, &result_data);
> does not call aer_error_resume()
>
> but
> pci_walk_bus(udev->bus, report_resume, &result_data);
> does call aer_error_resume()
>
> now if you look at the comment of the function:
> /**
>  * aer_error_resume - clean up corresponding error status bits
>  * @dev: pointer to Root Port's pci_dev data structure
>  *
>  * Invoked by Port Bus driver during nonfatal recovery.
>  */
>
> it is invoked during nonfatal recovery.
> but the code handles both fatal and nonfatal clearing of error bits.
>
>     if (dev->error_state == pci_channel_io_normal)
>         status &= ~mask; /* Clear corresponding nonfatal bits */
>     else
>         status &= mask; /* Clear corresponding fatal bits */
>     pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
>
>
> so the question is, should we not call aer_error_resume during fatal
> recovery ?
> so that it clears the root port status, if of course the error is triggered
> by AER running agent (RP, Switch)

I'm sure we *should* clear AER status bits somewhere during ERR_FATAL
recovery.  As far as I can tell, the current code (before your
patches) never calls aer_error_resume().  That might be a bug, but
even if it is, it's something that should be fixed separately from
*this* series.

I think in this series, you should probably adjust the patch that adds
do_fatal_recovery() so it doesn't call pci_walk_bus(..., report_resume).
I don't think that does anything useful anyway, and that patch already
changes AER so it doesn't call the pci_error_handlers callbacks (except
.resume()).
I think it would be cleaner to remove the ERR_FATAL use of .resume() at the same time you remove the others.
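
For illustration only, the root-bus argument discussed in this thread can be
sketched as a small userspace toy model. Everything named toy_* below is a
hypothetical stand-in and not a kernel structure or API; the recursive walk
only approximates what pci_walk_bus() does (visit every device on a bus and
on every bus below it), and the three-level topology is an assumption chosen
to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS 4

struct toy_bus;

/* Hypothetical stand-in for struct pci_dev: only what the walk needs. */
struct toy_dev {
	const char *name;
	bool is_root_port;           /* the AER driver binds only to these */
	struct toy_bus *subordinate; /* secondary bus of a bridge, else NULL */
};

/* Hypothetical stand-in for struct pci_bus. */
struct toy_bus {
	struct toy_dev *devs[TOY_MAX_DEVS];
	size_t ndevs;
};

/*
 * Depth-first walk in the spirit of pci_walk_bus(): visit every device on
 * "top" and on every bus below it, and count the root ports visited.
 * A walk that visits no root port can never reach a callback that only
 * the root port's driver provides.
 */
static size_t toy_count_root_ports(const struct toy_bus *top)
{
	size_t count = 0;

	if (!top)
		return 0;

	for (size_t i = 0; i < top->ndevs; i++) {
		const struct toy_dev *dev = top->devs[i];

		if (dev->is_root_port)
			count++;
		count += toy_count_root_ports(dev->subordinate);
	}
	return count;
}

/*
 * Assumed topology:
 *   root_bus -> root_port -> bus1 -> upstream (switch port) -> bus2 -> endpoint
 */
static struct toy_dev endpoint = {
	.name = "endpoint", .is_root_port = false, .subordinate = NULL,
};
static struct toy_bus bus2 = { .devs = { &endpoint }, .ndevs = 1 };
static struct toy_dev upstream = {
	.name = "switch-upstream", .is_root_port = false, .subordinate = &bus2,
};
static struct toy_bus bus1 = { .devs = { &upstream }, .ndevs = 1 };
static struct toy_dev root_port = {
	.name = "root-port", .is_root_port = true, .subordinate = &bus1,
};
static struct toy_bus root_bus = { .devs = { &root_port }, .ndevs = 1 };
```

In this model, toy_count_root_ports(root_port.subordinate) returns 0, which
mirrors the walk over dev->subordinate, while toy_count_root_ports(&root_bus)
returns 1, which mirrors the walk over udev->bus when udev is the root port.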