On Tue, Jun 04, 2013 at 04:15:21PM -0600, Bjorn Helgaas wrote: > Date: Tue, 4 Jun 2013 16:15:21 -0600 > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > To: Betty Dall <betty.dall@xxxxxx> > Cc: Chen Gong <gong.chen@xxxxxxxxxxxxxxx>, "Rafael J. Wysocki" > <rjw@xxxxxxx>, Huang Ying <ying.huang@xxxxxxxxx>, > "linux-acpi@xxxxxxxxxxxxxxx" <linux-acpi@xxxxxxxxxxxxxxx>, > "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, > "linux-pci@xxxxxxxxxxxxxxx" <linux-pci@xxxxxxxxxxxxxxx> > Subject: Re: [PATCH v2 3/3] PCI/AER: Provide reset_link for firmware first > root port > > On Tue, Jun 4, 2013 at 3:38 PM, Betty Dall <betty.dall@xxxxxx> wrote: > > On Tue, 2013-06-04 at 04:36 -0400, Chen Gong wrote: > >> On Thu, May 30, 2013 at 08:39:29AM -0600, Betty Dall wrote: > >> > Date: Thu, 30 May 2013 08:39:29 -0600 > >> > From: Betty Dall <betty.dall@xxxxxx> > >> > To: rjw@xxxxxxx, bhelgaas@xxxxxxxxxx > >> > Cc: ying.huang@xxxxxxxxx, linux-acpi@xxxxxxxxxxxxxxx, > >> > linux-kernel@xxxxxxxxxxxxxxx, linux-pci@xxxxxxxxxxxxxxx, Betty Dall > >> > <betty.dall@xxxxxx> > >> > Subject: [PATCH v2 3/3] PCI/AER: Provide reset_link for firmware first root > >> > port > >> > X-Mailer: git-send-email 1.7.7.6 > >> > > >> > The firmware first path does not install the aerdrv root port > >> > service driver, so the firmware first root port does not have > >> > a reset_link callback. When a firmware first root port does not have > >> > a reset_link callback, use a new default reset_link similar to what > >> > we already do for the default_downstream_reset_link(). The firmware > >> > first default reset_link brings the root port out of SBR if firmware > >> > left it in SBR. > >> > > >> > Changes since v1: > >> > Removed incorrect setting of p2p_ctrl after the read. > >> > > >> > Signed-off-by: Betty Dall <betty.dall@xxxxxx> > >> > --- > >> > > >> > drivers/pci/pcie/aer/aerdrv_core.c | 36 ++++++++++++++++++++++++++++++++++++ > >> > 1 files changed, 36 insertions(+), 0 deletions(-) > >> > > >> > > >> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c > >> > index 8ec8b4f..72f76cd 100644 > >> > --- a/drivers/pci/pcie/aer/aerdrv_core.c > >> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c > >> > @@ -413,6 +413,39 @@ static pci_ers_result_t default_downstream_reset_link(struct pci_dev *dev) > >> > return PCI_ERS_RESULT_RECOVERED; > >> > } > >> > > >> > +/** > >> > + * default_ff_root_port_reset_link - default reset function for firmware > >> > + * first Root Port > >> > + * @dev: pointer to root port's pci_dev data structure > >> > + * > >> > + * Invoked when performing link reset at Root Port w/ no aer driver. > >> > + * This happens through the firmware first path. Firmware may leave > >> > + * the Root Port in SBR and it is the OS responsiblity to bring it out > >> > + * of SBR. > >> > + */ > >> > +static pci_ers_result_t default_ff_root_port_reset_link(struct pci_dev *dev) > >> > +{ > >> > + u16 p2p_ctrl; > >> > + > >> > + /* Read Secondary Bus Reset */ > >> > + pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl); > >> > + > >> > + /* De-assert Secondary Bus Reset, if it is set */ > >> > + if (p2p_ctrl & PCI_BRIDGE_CTL_BUS_RESET) { > >> > + p2p_ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; > >> > + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl); > >> > + > >> > + /* > >> > + * System software must wait for at least 100ms from the end > >> > + * of a reset of one or more device before it is permitted > >> > + * to issue Configuration Requests to those devices. > >> > + */ > >> > + msleep(200); > >> > + dev_dbg(&dev->dev, "Root Port link has been reset\n"); > >> > + } > >> > + return PCI_ERS_RESULT_RECOVERED; > >> > +} > >> > >> I don't think this function is OK. > >> 1) You don't really reset the 2nd Bus but just checking its status. > >> I think you should have following steps to reset 2nd Bus: > >> > >> a. Assert 2nd Bus Reset > >> b. wait for some time until this message has been broadcasted well > >> c. De-assert 2nd Bus Reset > >> d. wait for Trst time > >> > >> IOW, since we have aer_do_secondary_bus_reset to perform secondary bus > >> reset, why you repeat it again? > >> > >> 2) msleep(200) is too long for kernel. You'd better yield the CPU when > >> sleep. > > > > The firmware first path currently has no reset_link. I want to make a > > minimal change since resetting the link could be considered firmware's > > job in the firmware first path. This change is to just check if SBR is > > set, and bring the link out of reset only if it is in SBR. This way, if > > a another firmware first platform is already resetting the link, it > > won't be done twice. I took the msleep and the code from > > aer_do_sercondary_bus_reset() as you noticed. > > I understand the desire to make a minimal change, and we certainly > don't want to make changes bigger than they need to be. > > However, my goal is really to have clean, maintainable code at the > end. If that means bigger patches where you can't test every affected > platform, that's a risk I'm willing to take. > > In this particular case, we could try to limit the change to just the > platforms you can test, as you've done. But it looks to me like we > actually want to reset the link in *all* cases, regardless of whether > firmware-first is involved. We're handling a fatal error for a > device. That device might be below a PCIe switch (where we currently > reset the link), or it might be directly below a root port (where we > currently don't). Why should those cases be different? I don't think > the device or the driver should be able to tell the difference. > > My *guess* is that this is an oversight in the original code, and that > doing the link reset for both downstream ports and root ports will > make error handling for devices below root ports work better. > > The *last* thing I want is code littered with special cases that are > only there because "they only fix the platforms I could actually > test." It's just about impossible to clean those up later. > > Bjorn I agree with Bjorn's conclusion. Once we meet a bogus BIOS implementation, it will be a disaster for that device.
Attachment:
signature.asc
Description: Digital signature