RE: [ PATCH ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Myron/Linas,

Thanks for pointing it out. I am still learning all the conventions. I will send out a newer version with the changes.

Thanks

Vijay

> -----Original Message-----
> From: Linas Vepstas [mailto:linasvepstas@xxxxxxxxx]
> Sent: Friday, November 09, 2012 7:28 PM
> To: Myron Stowe
> Cc: Pandarathil, Vijaymohan R; linux-pci@xxxxxxxxxxxxxxx
> Subject: Re: [ PATCH ] PCI-AER: Do not report successful error recovery for
> devices with AER-unaware drivers
> 
> Yes, that is correct: Reviewed-by/Acked-by: me. I plead late-night
> wooziness.
> 
> -- Linas
> 
> On 9 November 2012 11:06, Myron Stowe <myron.stowe@xxxxxxxxx> wrote:
> > Adding: linasvepstas@xxxxxxxxx
> >
> > On Thu, Nov 8, 2012 at 9:28 PM, Pandarathil, Vijaymohan R
> > <vijaymohan.pandarathil@xxxxxx> wrote:
> >> When an error is detected on a PCIe device which does not have an
> >> AER-aware driver, prevent AER infrastructure from reporting
> >> successful error recovery.
> >>
> >> This is because the report_error_detected() function that gets
> >> called in the first phase of recovery process allows forward
> >> progress even when the driver for the device does not have AER
> >> capabilities. It seems that all callbacks (in pci_error_handlers
> >> structure) registered by drivers that gets called during error
> >> recovery are not mandatory. So the intention of the infrastructure
> >> design seems to be to allow forward progress even when a specific
> >> callback has not been registered by a driver. However, if error
> >> handler structure itself has not been registered, it doesn't make
> >> sense to allow forward progress.
> >>
> >> As a result of the current design, in the case of a single device
> >> having an AER-unaware driver or in the case of any function in a
> >> multi-function card having an AER-unaware driver, a successful
> >> recovery is reported.
> >>
> >> Typical scenario this happens is when a PCI device is detached
> >> from a KVM host and the pci-stub driver on the host claims the
> >> device. The pci-stub driver does not have error handling capabilities
> >> but the AER infrastructure still reports that the device recovered
> >> successfully.
> >>
> >> The changes proposed here leaves the device in an unrecovered state
> >> if the driver for the device or for any function in a multi-function
> >> card does not have error handler structure registered. This reflects
> >> the true state of the device and prevents any partial recovery (or no
> >> recovery at all) reported as successful.
> >>
> >> Please also see comments from Linas Vepstas at the following link
> >> http://www.spinics.net/lists/linux-pci/msg18572.html
> >>
> >> Signed-off-by: Linas Vepstas <linasvepstas <at> gmail.com>
> >> Signed-off-by: Myron Stowe <mstowe <at> redhat.com>
> >> Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at>
> hp.com>
> >
> > Hay Vijay:
> >
> > I think Linas may have steered you incorrectly with his "FWIW, I guess
> > I could add a
> > Signed-off-by: Linas Vepstas <...>" comment.
> >
> > From ./Documentation/SubmittingPatches, "The Signed-off-by: tag
> > indicates that the
> > signer was involved in the development of the patch, or that he/she
> > was in the patch's
> > delivery path."
> >
> > I this case I suspect that "reviewed-by" or 'acked-by" would be more
> > appropriate for
> > both Linas (unless of course Linas worked with you in developing the
> patch) and
> > myslef.
> >
> > Also, when posting to the list it's considered good etiquette to
> > include (cc) everyone
> > cited in the Signed-off, reviewed-by, acked-by, ..., lines (and
> > possibly the linux-kernel
> > list - again see ./Documentation/SubmittingPatches).
> >
> > Myron
> >>
> >> ---
> >>
> >> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> b/drivers/pci/pcie/aer/aerdrv_core.c
> >> index 06bad96..030b229 100644
> >> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> >> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> >> @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
> *dev, void *data)
> >>
> >>         dev->error_state = result_data->state;
> >>
> >> +       if ((!dev->driver || !dev->driver->err_handler) &&
> >> +               !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
> >> +               dev_info(&dev->dev, "AER: Error detected but no driver
> has claimed this device or the driver is AER-unaware\n");
> >> +               result_data->result = PCI_ERS_RESULT_NONE;
> >> +               return 1;
> >> +       }
> >>         if (!dev->driver ||
> >>                 !dev->driver->err_handler ||
> >>                 !dev->driver->err_handler->error_detected) {
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
��.n��������+%������w��{.n�����{���"�)��jg��������ݢj����G�������j:+v���w�m������w�������h�����٥



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux