Re: 3.1-rc4: spectacular kernel errors / filesystem crash

On Tue, Sep 13, 2011 at 10:42 AM, Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx> wrote:
>
>
> On Tue, 13 Sep 2011, Jon Mason wrote:
>
>> On Tue, Sep 13, 2011 at 9:54 AM, Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx>
>> wrote:
>>>
>>>
>>> On Tue, 13 Sep 2011, Eric Dumazet wrote:
>>>
>>>> Please Justin make sure you pulled commit
>>>> commit ed2888e906b56769b4ffabb9c577190438aa68b8
>>>> Author: Jon Mason <mason@xxxxxxxx>
>>>> Date:   Thu Sep 8 16:41:18 2011 -0500
>>>>
>>>>   PCI: Remove MRRS modification from MPS setting code
>>>>
>>>>   Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
>>>>   massive negative ramifications on some devices.  Without knowing which
>>>>   devices have this issue, do not modify from the default value when
>>>>   walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
>>>>   the default procedure.
>>>>
>>>>   Tested-by: Sven Schnelle <svens@xxxxxxxxxxxxxx>
>>>>   Tested-by: Simon Kirby <sim@xxxxxxxxxx>
>>>>   Tested-by: Stephen M. Cameron <scameron@xxxxxxxxxxxxxxxxxx>
>>>>   Reported-and-tested-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
>>>>   Reported-and-tested-by: Niels Ole Salscheider
>>>> <niels_ole@salscheider-online.
>>>>   References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
>>>>   Signed-off-by: Jon Mason <mason@xxxxxxxx>
>>>>   Acked-by: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
>>>>   Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>>>
>>> Hello,
>>>
>>> I found this commit here:
>>> http://permalink.gmane.org/gmane.linux.kernel.pci/11700
>>
>> This is an early version of the patch.  This is the patch that you want:
>>
>> https://github.com/torvalds/linux/commit/ed2888e906b56769b4ffabb9c577190438aa68b8
>>
>> It appears that this patch didn't make it to lkml or linux-pci list
>> due to kernel.org DNS being down when it was sent.
>>
>> Thanks,
>> Jon
>
> I need to learn how to use git at some point, can you please provide plain
> text patches so I can apply them and reboot?
>
> Justin.

I've attached the two patches I asked Linus to include in 3.1-rc6.
Let me know if there are any issues.

Thanks,
Jon
From cf822aed99fd8851d82ae5f2df11c29b79e316c8 Mon Sep 17 00:00:00 2001
From: Shyam Iyer <shyam.iyer.t@xxxxxxxxx>
Date: Wed, 31 Aug 2011 12:21:42 -0400
Subject: [PATCH 1/2] Fix pointer dereference before call to
 pcie_bus_configure_settings

There is a potential NULL pointer dereference in calls to
pcie_bus_configure_settings due to attempts to access pci_bus self
variables when the self pointer is NULL.  To correct this, verify that
the self pointer in pci_bus is non-NULL before dereferencing it.

Reported-by: Stanislaw Gruszka <sgruszka@xxxxxxxxxx>
Signed-off-by: Shyam Iyer <shyam_iyer@xxxxxxxx>
Signed-off-by: Jon Mason <mason@xxxxxxxx>
---
 arch/x86/pci/acpi.c              |    9 +++++++--
 drivers/pci/hotplug/pcihp_slot.c |    4 +++-
 drivers/pci/probe.c              |    3 ---
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index c953302..039d913 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -365,8 +365,13 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
 	 */
 	if (bus) {
 		struct pci_bus *child;
-		list_for_each_entry(child, &bus->children, node)
-			pcie_bus_configure_settings(child, child->self->pcie_mpss);
+		list_for_each_entry(child, &bus->children, node) {
+			struct pci_dev *self = child->self;
+			if (!self)
+				continue;
+
+			pcie_bus_configure_settings(child, self->pcie_mpss);
+		}
 	}
 
 	if (!bus)
diff --git a/drivers/pci/hotplug/pcihp_slot.c b/drivers/pci/hotplug/pcihp_slot.c
index 753b21a..3ffd9c1 100644
--- a/drivers/pci/hotplug/pcihp_slot.c
+++ b/drivers/pci/hotplug/pcihp_slot.c
@@ -169,7 +169,9 @@ void pci_configure_slot(struct pci_dev *dev)
 			(dev->class >> 8) == PCI_CLASS_BRIDGE_PCI)))
 		return;
 
-	pcie_bus_configure_settings(dev->bus, dev->bus->self->pcie_mpss);
+	if (dev->bus && dev->bus->self)
+		pcie_bus_configure_settings(dev->bus,
+					    dev->bus->self->pcie_mpss);
 
 	memset(&hpp, 0, sizeof(hpp));
 	ret = pci_get_hp_params(dev, &hpp);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 8473727..0820fc1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1456,9 +1456,6 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 {
 	u8 smpss = mpss;
 
-	if (!bus->self)
-		return;
-
 	if (!pci_is_pcie(bus->self))
 		return;
 
-- 
1.7.6
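
For anyone reading the first patch without a kernel tree at hand, the
caller-side check it moves into place can be sketched in standalone C.
The names fake_dev, fake_bus, fake_configure_settings and
configure_children are hypothetical stand-ins, not kernel APIs; only the
field names self and pcie_mpss come from the diff. This is a minimal
sketch of the pattern, not the kernel code itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	unsigned char pcie_mpss;
};

/* Hypothetical stand-in for struct pci_bus; self may be NULL. */
struct fake_bus {
	struct fake_dev *self;
};

/* Counts calls so the effect of the guard can be observed. */
static int configure_calls;

/* Placeholder for pcie_bus_configure_settings(); it only counts. */
static void fake_configure_settings(struct fake_bus *bus, unsigned char mpss)
{
	(void)bus;
	(void)mpss;
	configure_calls++;
}

/*
 * Caller-side guard: test bus->self before reading bus->self->pcie_mpss,
 * because the argument is evaluated before the callee can check anything.
 */
static void configure_children(struct fake_bus *children, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct fake_dev *self = children[i].self;

		if (!self)
			continue;

		fake_configure_settings(&children[i], self->pcie_mpss);
	}
}
```

The point of the sketch is that a check inside the callee comes too
late: the dereference happens while the arguments are being built, so
the guard has to live in each caller.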

From 74d81235f8e4bd60859d539a27e51d3a09d183cf Mon Sep 17 00:00:00 2001
From: Jon Mason <mason@xxxxxxxx>
Date: Thu, 8 Sep 2011 12:59:00 -0500
Subject: [PATCH 2/2] PCI: Remove MRRS modification from MPS setting code

Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
massive negative ramifications on some devices.  Without knowing which
devices have this issue, do not modify from the default value when
walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
the default procedure.

Tested-by: Sven Schnelle <svens@xxxxxxxxxxxxxx>
Tested-by: Simon Kirby <sim@xxxxxxxxxx>
Tested-by: Stephen M. Cameron <scameron@xxxxxxxxxxxxxxxxxx>
Reported-and-tested-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Reported-and-tested-by: Niels Ole Salscheider <niels_ole@xxxxxxxxxxxxxxxxxxxxx>
References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
Signed-off-by: Jon Mason <mason@xxxxxxxx>
---
 drivers/pci/pci.c   |    2 +-
 drivers/pci/probe.c |   41 ++++++++++++++++++++++-------------------
 2 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0ce6742..4e84fd4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -77,7 +77,7 @@ unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
 unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
 unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
 
-enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PERFORMANCE;
+enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_SAFE;
 
 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 0820fc1..b1187ff 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1396,34 +1396,37 @@ static void pcie_write_mps(struct pci_dev *dev, int mps)
 
 static void pcie_write_mrrs(struct pci_dev *dev, int mps)
 {
-	int rc, mrrs;
+	int rc, mrrs, dev_mpss;
 
-	if (pcie_bus_config == PCIE_BUS_PERFORMANCE) {
-		int dev_mpss = 128 << dev->pcie_mpss;
+	/* In the "safe" case, do not configure the MRRS.  There appear to be
+	 * issues with setting MRRS to 0 on a number of devices.
+	 */
 
-		/* For Max performance, the MRRS must be set to the largest
-		 * supported value.  However, it cannot be configured larger
-		 * than the MPS the device or the bus can support.  This assumes
-		 * that the largest MRRS available on the device cannot be
-		 * smaller than the device MPSS.
-		 */
-		mrrs = mps < dev_mpss ? mps : dev_mpss;
-	} else
-		/* In the "safe" case, configure the MRRS for fairness on the
-		 * bus by making all devices have the same size
-		 */
-		mrrs = mps;
+	if (pcie_bus_config != PCIE_BUS_PERFORMANCE)
+		return;
+
+	dev_mpss = 128 << dev->pcie_mpss;
 
+	/* For Max performance, the MRRS must be set to the largest supported
+	 * value.  However, it cannot be configured larger than the MPS the
+	 * device or the bus can support.  This assumes that the largest MRRS
+	 * available on the device cannot be smaller than the device MPSS.
+	 */
+	mrrs = min(mps, dev_mpss);
 
 	/* MRRS is a R/W register.  Invalid values can be written, but a
-	 * subsiquent read will verify if the value is acceptable or not.
+	 * subsequent read will verify if the value is acceptable or not.
 	 * If the MRRS value provided is not acceptable (e.g., too large),
 	 * shrink the value until it is acceptable to the HW.
  	 */
 	while (mrrs != pcie_get_readrq(dev) && mrrs >= 128) {
+		dev_warn(&dev->dev, "Attempting to modify the PCI-E MRRS value"
+			 " to %d.  If any issues are encountered, please try "
+			 "running with pci=pcie_bus_safe\n", mrrs);
 		rc = pcie_set_readrq(dev, mrrs);
 		if (rc)
-			dev_err(&dev->dev, "Failed attempting to set the MRRS\n");
+			dev_err(&dev->dev,
+				"Failed attempting to set the MRRS\n");
 
 		mrrs /= 2;
 	}
@@ -1436,13 +1439,13 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
 	if (!pci_is_pcie(dev))
 		return 0;
 
-	dev_info(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
+	dev_dbg(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
 		 pcie_get_mps(dev), 128<<dev->pcie_mpss, pcie_get_readrq(dev));
 
 	pcie_write_mps(dev, mps);
 	pcie_write_mrrs(dev, mps);
 
-	dev_info(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
+	dev_dbg(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
 		 pcie_get_mps(dev), 128<<dev->pcie_mpss, pcie_get_readrq(dev));
 
 	return 0;
-- 
1.7.6
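
The MRRS handling in the second patch can also be modelled outside the
kernel. The sketch below assumes a made-up device model (model_dev,
model_set_readrq, model_write_mrrs and mpss_bytes are all hypothetical
names) in which an out-of-range write is silently ignored. The sizes
follow the 128 << code encoding used in the diff, so a code of 0 means
128 bytes. Unlike the loop in the diff, this sketch stops as soon as a
write reads back correctly; that is an assumption about the intent
described in the comment, not a copy of the patched code.

```c
/* Hypothetical device model; none of these names exist in the kernel. */
struct model_dev {
	int mpss_code;	/* encoded Max Payload Size Supported */
	int max_readrq;	/* largest MRRS in bytes the model accepts */
	int readrq;	/* current MRRS in bytes */
};

/* Decode a size field: bytes = 128 << code, so code 0 is 128 bytes. */
static int mpss_bytes(int code)
{
	return 128 << code;
}

/* Model of a register write that is ignored when the value is too large. */
static void model_set_readrq(struct model_dev *dev, int bytes)
{
	if (bytes <= dev->max_readrq)
		dev->readrq = bytes;
}

/*
 * Returns the MRRS in bytes after configuration.
 * performance == 0 models the "safe" mode: MRRS is left untouched.
 */
static int model_write_mrrs(struct model_dev *dev, int mps, int performance)
{
	int dev_mpss, mrrs;

	if (!performance)
		return dev->readrq;

	/* Start from the smaller of the bus MPS and the device MPSS. */
	dev_mpss = mpss_bytes(dev->mpss_code);
	mrrs = mps < dev_mpss ? mps : dev_mpss;

	/* Shrink only when the value written does not read back. */
	while (mrrs >= 128) {
		model_set_readrq(dev, mrrs);
		if (dev->readrq == mrrs)
			break;
		mrrs /= 2;
	}

	return dev->readrq;
}
```

With a model device that supports an MPSS of 512 bytes but accepts read
requests of at most 256 bytes, performance mode settles on 256 after one
rejected write, while safe mode leaves whatever value was already set.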

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
