On Fri, Sep 28, 2018 at 09:42:20AM -0600, Keith Busch wrote:
> On Thu, Sep 27, 2018 at 05:56:25PM -0500, Bjorn Helgaas wrote:
> > On Wed, Sep 26, 2018 at 04:19:25PM -0600, Keith Busch wrote:
> > > On Wed, Sep 26, 2018 at 05:01:16PM -0500, Bjorn Helgaas wrote:
> > > > On Thu, Sep 20, 2018 at 10:27:13AM -0600, Keith Busch wrote:
> > > > > The link reset always used the first bridge device, but AER broadcast
> > > > > error handling may have reported an end device. This means the reset may
> > > > > hit devices that were never notified of the impending error recovery.
> > > > >
> > > > > This patch uses the first downstream port in the hierarchy considered
> > > > > reliable. An error detected by a switch upstream port should mean it
> > > > > occurred on its upstream link, so the patch selects the parent device
> > > > > if the error is not a root or downstream port.
> > > >
> > > > I'm not really clear on what "Always use the first downstream port"
> > > > means. Always use it for *what*?
> >
> > And I forgot to ask what "first downstream port" means.
>
> The "first downstream port" was supposed to mean the first DSP we
> find when walking toward the root from the pci_dev that reported
> ERR_[NON]FATAL.
>
> By "use", I mean "walking down the sub-tree for error handling".
>
> After thinking more on this, that doesn't really capture the intent. A
> better way might be:
>
>   Run error handling starting from the downstream port of the bus
>   reporting an error

I think the light is beginning to dawn. Does this make sense (as far
as it goes)?

  PCI/ERR: Run error recovery callbacks for all affected devices

  If an Endpoint reports an error with ERR_FATAL, we previously ran
  driver error recovery callbacks only for the Endpoint. But if
  recovery requires that we reset the Endpoint, we may have to use a
  port farther upstream to initiate a Link reset, and that may affect
  components other than the Endpoint, e.g., multi-function peers and
  their children.

  Drivers for those devices were never notified of the impending
  reset. Call driver error recovery callbacks for every device that
  will be reset.

Now help me understand this part:

> This allows two other clean-ups. First, error handling can only run
> on bridges so this patch removes checks for endpoints.

"error handling can only run on bridges"? I *think* only Root Ports
and Root Complex Event Collectors can assert AER interrupts, so
aer_irq() is only run for them (actually I don't think we quite
support Event Collectors yet). Is this what you're getting at?

Probably not, because the "dev" passed to pcie_do_recovery() isn't an
RP or RCEC. But the e_info->dev[i] that aer_process_err_devices()
eventually passes in doesn't have to be a bridge at all, does it? So
I can't quite figure out the bridge connection here.

> Second, the first accessible port does not inherit the channel error
> state since we can access it, so the special cases for error detect
> and resume are no longer necessary.

When we call pcie_do_recovery(), I guess we specify a "dev" that
logged an AER event and a pci_channel_state. It seems a little bit
weird to handle those separately, as opposed to incorporating it into
the pci_dev or something.

But if you're just saying that "if A is frozen because of an error,
that doesn't mean bridges upstream from A are frozen", that makes
good sense.

Bjorn