Re: [PATCH RESEND] PCI/DPC: Fix print AER status in DPC event handling

Hi Keith,

Many thanks for your review.

On 2019/2/11 23:46, Keith Busch wrote:
On Mon, Feb 11, 2019 at 03:02:59PM +0800, Dongdong Liu wrote:
+static int dpc_get_aer_uncorrect_severity(struct pci_dev *dev,
+					  struct aer_err_info *info)
+{
+	int pos = dev->aer_cap;
+	u32 status, mask, sev;
+
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_MASK, &mask);
+	status &= ~mask;
+	if (!status)
+		return 0;
+
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &sev);
+	status &= sev;
+	if (status)
+		info->severity = AER_FATAL;
+	else
+		info->severity = AER_NONFATAL;
+
+	return 1;
+}
+

You can set info->severity to AER_FATAL since that's the only type we
enable DPC triggering.

The spec describes the two fields as follows:

DPC Trigger Enable
01b: DPC is enabled and is triggered when the Downstream Port detects
an unmasked uncorrectable error or when the Downstream Port receives an
ERR_FATAL Message.

DPC Trigger Reason
00b: DPC was triggered due to an unmasked uncorrectable error.

So reason == 0 means the port detected an unmasked uncorrectable error,
which can be either non-fatal or fatal, and we still need to read the
severity.
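
To illustrate the point, here is a rough user-space sketch of the
classification the helper in the patch performs. It is not kernel code:
the sketch_* names are hypothetical, and the three arguments stand in
for the values read from PCI_ERR_UNCOR_STATUS, PCI_ERR_UNCOR_MASK and
PCI_ERR_UNCOR_SEVER.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch only (not the kernel's AER_* values). */
enum sketch_severity {
	SKETCH_NONE,		/* no unmasked uncorrectable error pending */
	SKETCH_NONFATAL,	/* unmasked error(s), none marked fatal */
	SKETCH_FATAL		/* at least one unmasked error marked fatal */
};

/*
 * status: value standing in for the Uncorrectable Error Status register
 * mask:   value standing in for the Uncorrectable Error Mask register
 * sever:  value standing in for the Uncorrectable Error Severity register
 */
static enum sketch_severity sketch_classify_uncor(uint32_t status,
						  uint32_t mask,
						  uint32_t sever)
{
	uint32_t unmasked = status & ~mask;	/* drop masked error bits */

	if (!unmasked)
		return SKETCH_NONE;

	/* A set severity bit marks the corresponding error as fatal. */
	return (unmasked & sever) ? SKETCH_FATAL : SKETCH_NONFATAL;
}
```

For example, with status = 0x4000, mask = 0 and sever = 0x10 the pending
error is unmasked but its severity bit is clear, so the result is
SKETCH_NONFATAL even though DPC could have been triggered with reason == 0.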


 static irqreturn_t dpc_handler(int irq, void *context)
 {
 	struct aer_err_info info;
@@ -229,9 +251,12 @@ static irqreturn_t dpc_handler(int irq, void *context)
 	/* show RP PIO error detail information */
 	if (dpc->rp_extensions && reason == 3 && ext_reason == 0)
 		dpc_process_rp_pio_error(dpc);
-	else if (reason == 0 && aer_get_device_error_info(pdev, &info)) {
+	else if (reason == 0 &&
+		 dpc_get_aer_uncorrect_severity(pdev, &info) &&
+		 aer_get_device_error_info(pdev, &info)) {
 		aer_print_error(pdev, &info);
 		pci_cleanup_aer_uncorrect_error_status(pdev);
+		pci_aer_clear_fatal_status(pdev);

Good catch here, but let's clear the pending bits with a single call
to pci_cleanup_aer_error_status_regs() rather than NONFATAL and
FATAL separately.

pci_cleanup_aer_error_status_regs() also clears the correctable error status.
That seems broader than we want here, since reason == 0 only means that an
unmasked uncorrectable error was detected.
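
The difference in scope can be sketched in user space as below. This is
only a model, assuming write-1-to-clear status registers; the struct and
the sketch_* functions are hypothetical and do not reproduce what the
kernel helpers actually do.

```c
#include <stdint.h>

/* Hypothetical model of the two AER status registers. */
struct sketch_aer_regs {
	uint32_t uncor_status;	/* stands in for PCI_ERR_UNCOR_STATUS */
	uint32_t cor_status;	/* stands in for PCI_ERR_COR_STATUS */
};

/* Write-1-to-clear: every bit set in val is cleared in *reg. */
static void sketch_w1c(uint32_t *reg, uint32_t val)
{
	*reg &= ~val;
}

/* Clear only the uncorrectable status, fatal and non-fatal bits alike. */
static void sketch_clear_uncor_only(struct sketch_aer_regs *r)
{
	sketch_w1c(&r->uncor_status, r->uncor_status);
}

/* Clear both status registers, so correctable status is lost as well. */
static void sketch_clear_all(struct sketch_aer_regs *r)
{
	sketch_w1c(&r->uncor_status, r->uncor_status);
	sketch_w1c(&r->cor_status, r->cor_status);
}
```

With sketch_clear_uncor_only() a pending correctable status bit survives
the DPC handling, while sketch_clear_all() wipes it too.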

Thanks,
Dongdong
