Re: [PATCH]: aer-inject: Override PCIe AER Mask Registers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/10/2011 11:34 PM, Hidetoshi Seto wrote:
> Hi Prarit,

Hi Hidetoshi,

> 
> Which is your goal?:
>  1) Allow changing mask while test, to inject errors originally masked

#1 is my goal.  Unfortunately in some of the latest 
BIOSes, the masks are set such that ALL errors are 
masked off.  I should still be able to test out the
*software* though ...

>  2) Allow setting an error bit even if it is masked
> 
> Did you refer commit b49bfd32901625e4adcfee011d2b32a43b4db67d ?
> 

Yes, and I understand what that patch is attempting
to do.  However, we have some HW that seems to now have
these masks set erroneously.  Even then, testing the software
shouldn't be a problem.

FWIW, new patch.  dnelson pointed out that in the case of an
error the masks did not get reset properly.  Oops ...

--

I have several systems which have the same problem:  The PCIe AER corrected and
uncorrected masks have all the error bits set.  This results in the inablility
to test with the aer_inject module & utility on those systems.

Add the 'aer_mask_override' module parameter which will override the corrected
or uncorrected masks for a PCI device.  The mask will have the bit
corresponding to the status passed into the aer_inject() function.

After this patch it is possible to successfully use the aer_inject utility
on those PCI slots.

Successfully tested by me on a Dell and Intel whitebox which exhibited the
mask problem.

Signed-off-by: Prarit Bhargava <prarit@xxxxxxxxxx>

diff --git a/drivers/pci/pcie/aer/aer_inject.c b/drivers/pci/pcie/aer/aer_inject.c
index b3cf622..f62079f 100644
--- a/drivers/pci/pcie/aer/aer_inject.c
+++ b/drivers/pci/pcie/aer/aer_inject.c
@@ -27,6 +27,10 @@
 #include <linux/stddef.h>
 #include "aerdrv.h"
 
+/* Override the existing corrected and uncorrected error masks */
+static int aer_mask_override;
+module_param(aer_mask_override, bool, 0);
+
 struct aer_error_inj {
 	u8 bus;
 	u8 dev;
@@ -322,7 +326,7 @@ static int aer_inject(struct aer_error_inj *einj)
 	unsigned long flags;
 	unsigned int devfn = PCI_DEVFN(einj->dev, einj->fn);
 	int pos_cap_err, rp_pos_cap_err;
-	u32 sever, cor_mask, uncor_mask;
+	u32 sever, cor_mask, uncor_mask, cor_mask_orig, uncor_mask_orig;
 	int ret = 0;
 
 	dev = pci_get_domain_bus_and_slot((int)einj->domain, einj->bus, devfn);
@@ -361,6 +365,18 @@ static int aer_inject(struct aer_error_inj *einj)
 		goto out_put;
 	}
 
+	if (aer_mask_override) {
+		cor_mask_orig = cor_mask;
+		cor_mask &= !(einj->cor_status);
+		pci_write_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK,
+				       cor_mask);
+
+		uncor_mask_orig = uncor_mask;
+		uncor_mask &= !(einj->uncor_status);
+		pci_write_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_MASK,
+				       uncor_mask);
+	}
+
 	spin_lock_irqsave(&inject_lock, flags);
 
 	err = __find_aer_error_by_dev(dev);
@@ -378,14 +394,16 @@ static int aer_inject(struct aer_error_inj *einj)
 	err->header_log2 = einj->header_log2;
 	err->header_log3 = einj->header_log3;
 
-	if (einj->cor_status && !(einj->cor_status & ~cor_mask)) {
+	if (!aer_mask_override && einj->cor_status &&
+	    !(einj->cor_status & ~cor_mask)) {
 		ret = -EINVAL;
 		printk(KERN_WARNING "The correctable error(s) is masked "
 				"by device\n");
 		spin_unlock_irqrestore(&inject_lock, flags);
 		goto out_put;
 	}
-	if (einj->uncor_status && !(einj->uncor_status & ~uncor_mask)) {
+	if (!aer_mask_override && einj->uncor_status &&
+	    !(einj->uncor_status & ~uncor_mask)) {
 		ret = -EINVAL;
 		printk(KERN_WARNING "The uncorrectable error(s) is masked "
 				"by device\n");
@@ -425,6 +443,13 @@ static int aer_inject(struct aer_error_inj *einj)
 	}
 	spin_unlock_irqrestore(&inject_lock, flags);
 
+	if (aer_mask_override) {
+		pci_write_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK,
+				       cor_mask_orig);
+		pci_write_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_MASK,
+				       uncor_mask_orig);
+	}
+
 	ret = pci_bus_set_aer_ops(dev->bus);
 	if (ret)
 		goto out_put;
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux