On Wed, Oct 5, 2011 at 11:37 AM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: > On Wed, Oct 5, 2011 at 10:17 AM, Jon Mason <mason@xxxxxxxx> wrote: >> Intel 5000 and 5100 series memory controllers have a known issue if read >> completion coalescing is enabled and the PCI-E Maximum Payload Size is >> set to 256B. To work around this issue, disable read completion >> coalescing in the memory controller and root complexes. Unfortunately, >> it must always be disabled, even if no 256B MPS devices are precent, due > > "present" > >> to the possiblity of one being hotplugged. > > "possibility" One of these days I'll learn to always run spell check on these. Thanks. > >> Links to erratas: >> http://www.intel.com/content/dam/doc/specification-update/5000-chipset-memory-controller-hub-specification-update.pdf >> http://www.intel.com/content/dam/doc/specification-update/5100-memory-controller-hub-chipset-specification-update.pdf >> >> Thanks to Jesse Brandeburg and Ben Hutchings for providing insight into >> the problem. >> >> Tested-and-Reported-by: Avi Kivity <avi@xxxxxxxxxx> >> Signed-off-by: Jon Mason <mason@xxxxxxxx> >> --- >> drivers/pci/quirks.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++ >> 1 files changed, 69 insertions(+), 0 deletions(-) >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c >> index 1196f61..70da3f5 100644 >> --- a/drivers/pci/quirks.c >> +++ b/drivers/pci/quirks.c >> @@ -2822,6 +2822,75 @@ static void __devinit fixup_ti816x_class(struct pci_dev* dev) >> } >> DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_TI, 0xb800, fixup_ti816x_class); >> >> +/* Intel 5000 and 5100 Memory controllers have an errata with read completion >> + * coalescing (which is enabled by default on some BIOSes) and MPS of 256B. >> + * Since there is no way of knowing what the PCIE MPS on each fabric will be >> + * until all of the devices are discovered and buses walked, read completion >> + * coalescing must be disabled. Unfortunately, it cannot be re-enabled because >> + * it is possible to hotplug a device with MPS of 256B. >> + */ >> +static void __devinit quirk_intel_mc_errata(struct pci_dev *dev) >> +{ >> + int err; >> + u16 rcc; >> + >> + if (pcie_bus_config == PCIE_BUS_TUNE_OFF) >> + return; >> + >> + /* Intel errata specifies bits to change but does not say what they are. >> + * Keeping them magical until such time as the registers and values can >> + * be explained. >> + */ >> + err = pci_read_config_word(dev, 0x48, &rcc); >> + if (err) { >> + dev_err(&dev->dev, "Error attempting to read the read " >> + "completion coalescing register.\n"); >> + return; >> + } >> + >> + if (!(rcc & (1 << 10))) >> + return; >> + >> + rcc &= ~(1 << 10); >> + >> + err = pci_write_config_word(dev, 0x48, rcc); >> + if (err) { >> + dev_err(&dev->dev, "Error attempting to write the read " >> + "completion coalescing register.\n"); >> + return; >> + } >> + >> + pr_info_once("Read completion coalescing disabled due to hardware " >> + "errata relating to 256B MPS.\n"); > > Can you use dev_info() here so we have the context that (a) this is > PCI-related and (b) specifically related to device X? It would print out 6 times for each system (one for the memory controller and one for each root port). I thought printing it once was better, since it is really a system wide setting. > >> +} >> +/* Intel 5000 series memory controllers and ports 2-7 */ >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25c0, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d0, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d4, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d8, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e2, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e3, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e4, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e5, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e6, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e7, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f7, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f8, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f9, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25fa, quirk_intel_mc_errata); >> +/* Intel 5100 series memory controllers and ports 2-7 */ >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65c0, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e2, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e3, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e4, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e5, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e6, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e7, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f7, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f8, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f9, quirk_intel_mc_errata); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65fa, quirk_intel_mc_errata); > > Wow, that's a long list. Makes me suspect that we'll have to keep > adding things to it, but I guess there's no good way to do it > generically. I looked at a number of other Intel MC erratas, and didn't see any MPS related issues. Hopefully this is it for these, though who knows if there are similar issues on AMD/nVidia memory memory controllers. Guess there is one way to find out. :) Should I push this for 3.1-rcX or wait for 3.2? > > Bjorn > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html