- drivers-edac-pci-broken-parity-regression.patch removed from -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The patch titled
     drivers/edac: pci: broken parity regression
has been removed from the -mm tree.  Its filename was
     drivers-edac-pci-broken-parity-regression.patch

This patch was dropped because it was merged into mainline or a subsystem tree

The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/

------------------------------------------------------
Subject: drivers/edac: pci: broken parity regression
From: Bryan Boatright <b1@xxxxxxxxxxx>

Using the EDAC code in kernel.org kernel version 2.6.23.8 I am seeing the
following problem:

    In the kernel there is a pci device attribute located in sysfs that is
    checked by the EDAC PCI scanning code. If that attribute is set,
    PCI parity/error scannining is skipped for that device. The attribute
    is:

            broken_parity_status

    as is located in /sys/devices/pci<XXX>/0000:XX:YY.Z directorys for
    PCI devices.

I don't think this check was actually implemented.  I have a misbehaved card
that reports a parity error every 1000 ms:

Nov 25 07:28:43 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0
Nov 25 07:28:44 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0
Nov 25 07:28:45 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0

Setting that card's broken_parity_status bit did not mask the error:

echo "1" > /sys/bus/pci/devices/0000:05:01.0/broken_parity_status

I looked through the EDAC code and did not readily see any reference to
broken_parity_status at all (which makes sense based on the behavior I am
seeing).  I applied the following patch as a proof-of-concept and now EDAC's
PCI parity error reporting behaves as documented:

bryan

Good regression find, bryan. It used to work. sigh.
I added more logic to your patch, for more coverage of the error.

Doug T

Signed-off-by: Bryan Boatright <b1@xxxxxxxxxxx>
Signed-off-by: Doug Thompson <dougthompson@xxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 drivers/edac/edac_pci_sysfs.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff -puN drivers/edac/edac_pci_sysfs.c~drivers-edac-pci-broken-parity-regression drivers/edac/edac_pci_sysfs.c
--- a/drivers/edac/edac_pci_sysfs.c~drivers-edac-pci-broken-parity-regression
+++ a/drivers/edac/edac_pci_sysfs.c
@@ -558,8 +558,10 @@ static void edac_pci_dev_parity_test(str
 
 	debugf4("PCI STATUS= 0x%04x %s\n", status, dev->dev.bus_id);
 
-	/* check the status reg for errors */
-	if (status) {
+	/* check the status reg for errors on boards NOT marked as broken
+	 * if broken, we cannot trust any of the status bits
+	 */
+	if (status && !dev->broken_parity_status) {
 		if (status & (PCI_STATUS_SIG_SYSTEM_ERROR)) {
 			edac_printk(KERN_CRIT, EDAC_PCI,
 				"Signaled System Error on %s\n",
@@ -593,8 +595,10 @@ static void edac_pci_dev_parity_test(str
 
 		debugf4("PCI SEC_STATUS= 0x%04x %s\n", status, dev->dev.bus_id);
 
-		/* check the secondary status reg for errors */
-		if (status) {
+		/* check the secondary status reg for errors,
+		 * on NOT broken boards
+		 */
+		if (status && !dev->broken_parity_status) {
 			if (status & (PCI_STATUS_SIG_SYSTEM_ERROR)) {
 				edac_printk(KERN_CRIT, EDAC_PCI, "Bridge "
 					"Signaled System Error on %s\n",
_

Patches currently in -mm which might be from b1@xxxxxxxxxxx are

origin.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Kernel Newbies FAQ]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Photo]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux