Re: [PATCH 4/4] PCI/DPC: Print AER status in DPC event handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 12, 2018 at 05:03:03PM -0600, Bjorn Helgaas wrote:
> Hi Keith,
> 
> On Tue, Dec 19, 2017 at 02:06:43PM -0700, Keith Busch wrote:
> > A DPC enabled device will suppress sending ERR_FATAL and ERR_NONFATAL,
> > which prevents the AER handler from reporting the details of the
> > error. This patch will have the DPC driver get the AER status registers
> > from the downstream port that detected the uncorrectable error, and
> > print out additional information.
> > 
> > Signed-off-by: Keith Busch <keith.busch@xxxxxxxxx>
> > ---
> >  drivers/pci/pcie/pcie-dpc.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c
> > index ef71a472592c..aefc4fbdcef0 100644
> > --- a/drivers/pci/pcie/pcie-dpc.c
> > +++ b/drivers/pci/pcie/pcie-dpc.c
> > @@ -44,6 +44,7 @@ struct dpc_dev {
> >  	int			cap_pos;
> >  	bool			rp;
> >  	u32			rp_pio_status;
> > +	struct aer_err_info	info;
> >  };
> >  
> >  static const char * const rp_pio_error_string[] = {
> > @@ -111,6 +112,9 @@ static void interrupt_event_handler(struct work_struct *work)
> >  	struct pci_dev *dev, *temp, *pdev = dpc->dev->port;
> >  	struct pci_bus *parent = pdev->subordinate;
> >  
> > +	if (aer_get_device_error_info(pdev, &dpc->info))
> > +		aer_print_error(pdev, &dpc->info);
> 
> I'm confused about this.  "pdev" is the DPC-enabled Port, i.e., the
> Port that received an ERR_FATAL or ERR_NONFATAL Message.  The Message
> was generated by something below "pdev", and that's where the
> interesting AER logging would have been done.
> 
> This patch suggests that the DPC port itself would have something
> interesting logged in its AER capability.  Is that really true?  I see
> in sec 6.2.10 (PCIe r4.0) that the DPC port should log the Trigger
> Reason and the Error Source ID from the Message, but I don't see
> anything about logging anything in its AER registers.

If the trigger reason is "unmasked uncorrectable error", then the DPC
AER register has useful information. All other reasons would not provide
useful data in the AER register.


I would assume that aer_get_device_error_info would return false if
it's one of other reasons, but I can fix this up to check the trigger
reasoning rather than unconditionally reading the AER status



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux