Re: [PATCH 4/4] PCI/DPC: Print AER status in DPC event handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jan 13, 2018 at 06:35:18PM -0700, Keith Busch wrote:
> On Fri, Jan 12, 2018 at 05:03:03PM -0600, Bjorn Helgaas wrote:
> > Hi Keith,
> > 
> > On Tue, Dec 19, 2017 at 02:06:43PM -0700, Keith Busch wrote:
> > > A DPC enabled device will suppress sending ERR_FATAL and ERR_NONFATAL,
> > > which prevents the AER handler from reporting the details of the
> > > error. This patch will have the DPC driver get the AER status registers
> > > from the downstream port that detected the uncorrectable error, and
> > > print out additional information.
> > > 
> > > Signed-off-by: Keith Busch <keith.busch@xxxxxxxxx>
> > > ---
> > >  drivers/pci/pcie/pcie-dpc.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c
> > > index ef71a472592c..aefc4fbdcef0 100644
> > > --- a/drivers/pci/pcie/pcie-dpc.c
> > > +++ b/drivers/pci/pcie/pcie-dpc.c
> > > @@ -44,6 +44,7 @@ struct dpc_dev {
> > >  	int			cap_pos;
> > >  	bool			rp;
> > >  	u32			rp_pio_status;
> > > +	struct aer_err_info	info;
> > >  };
> > >  
> > >  static const char * const rp_pio_error_string[] = {
> > > @@ -111,6 +112,9 @@ static void interrupt_event_handler(struct work_struct *work)
> > >  	struct pci_dev *dev, *temp, *pdev = dpc->dev->port;
> > >  	struct pci_bus *parent = pdev->subordinate;
> > >  
> > > +	if (aer_get_device_error_info(pdev, &dpc->info))
> > > +		aer_print_error(pdev, &dpc->info);
> > 
> > I'm confused about this.  "pdev" is the DPC-enabled Port, i.e., the
> > Port that received an ERR_FATAL or ERR_NONFATAL Message.  The Message
> > was generated by something below "pdev", and that's where the
> > interesting AER logging would have been done.
> > 
> > This patch suggests that the DPC port itself would have something
> > interesting logged in its AER capability.  Is that really true?  I see
> > in sec 6.2.10 (PCIe r4.0) that the DPC port should log the Trigger
> > Reason and the Error Source ID from the Message, but I don't see
> > anything about logging anything in its AER registers.
> 
> If the trigger reason is "unmasked uncorrectable error", then the DPC
> AER register has useful information. All other reasons would not provide
> useful data in the AER register.

OK, that makes sense.  This "unmasked uncorrectable error" case would
be things like the Port receiving an Unsupported Request completion, a
Completion Timeout, Completer Abort, etc.  In that case, the DPC Port
should do its own AER logging before triggering DPC.

> I would assume that aer_get_device_error_info would return false if
> it's one of other reasons, but I can fix this up to check the trigger
> reasoning rather than unconditionally reading the AER status

It might be overkill, but it does niggle at me that in these other
cases, the hardware isn't telling us to read the AER logging, but we
do it anyway.

We currently pass no information to interrupt_event_handler(), so it
has no way of testing the trigger reason.  I guess you could do the
test in dpc_irq() before scheduling interrupt_event_handler()?

That would move it from the work item to the ISR, but I guess it's
essentially the same sort of code as dpc_process_rp_pio_error(), which
we already do in the ISR.

Bjorn



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux