On Wednesday, February 23, 2011 12:21:19 pm Mauro Carvalho Chehab wrote:
> On 23-02-2011 14:14, Bjorn Helgaas wrote:
> > On Wednesday, February 23, 2011 03:08:10 am Jan Beulich wrote:
> >> On various newer Intel systems the PCI bus(ses) the non-core devices
> >> live on aren't getting announced by ACPI except through the bus range
> >> covered by mmconfig. At least the i7core-edac driver depends on these
> >> devices getting detected.
> >
> > I think you're saying:
> >
> >   - the PCI host bridge has a _CRS method that reports the downstream
> >     bus number range
> >   - there are downstream devices on a bus X outside that _CRS range
> >   - the MCFG table has an entry that covers bus X
> >
> > What are these downstream devices? Might the BIOS be intentionally
> > excluding them from PCI discovery so it can use them for its own
> > purposes, e.g., things used by SMM code or exposed via an ACPI
> > namespace device?
> >
> > If these devices are really intended for PCI discovery by the OS,
> > the fact that the _CRS range excludes the bus sounds like a simple
> > BIOS defect, and the workaround you propose feels like an ad hoc
> > strategy that could be fragile.
> >
> > If the BIOS is intentionally hiding the devices by excluding the
> > bus from the _CRS range, I think the BIOS could (and probably should)
> > also exclude the bus from the ranges in the MCFG table, and then
> > this fix would fail.
>
> AFAIK, some BIOS are intentionally hiding those devices. The devices that
> the i7core-edac driver needs are the non-core PCI devices where the memory
> controllers are, on Nehalem/Nehalem-EP, like:
>
> 3f:00.0 Host bridge: Intel Corporation Xeon 5500/Core i7 QuickPath Architecture Generic Non-Core Registers (rev 05)
...

(thanks for the specific example; that helps a lot)

> In the specific machine I'm getting the above (a HP Z400 Workstation), BIOS
> is not hiding those devices.
> But there are several reports of machines hiding
> them (I think I have access to one of them, but I need to remember where).
>
> I'm not entirely sure if using the MCFG table will solve the issue, as BIOS
> might also fill MCFG with wrong info.
>
> As those PCI devices are inside the processor, there's probably some way
> to read and/or change where they are mapped, as I saw the same processor
> having those devices at address 0x3f and at address 0xff.

There are good reasons why a BIOS might hide a PCI device. For example,
if SMM code uses a PCI device, the BIOS must prevent the OS from moving
it. One way would be to hide the device from PCI enumeration and then
expose it via the ACPI namespace, where the _CRS/_PRS/_SRS methods allow
the BIOS to control the configuration.

I know there's some tension here -- things like EDAC want to use devices
we "know" are there, while the BIOS might need to hide things to keep
our mitts off them. I'm not sure there's a reliable way to tell when
it's safe for us to go around the BIOS intent.

Nobody wants to give up EDAC functionality, but in the long term, it
might be better to say, "Well, your OEM doesn't want to support
functionality X even though the hardware is there, so consider that
when you're choosing your next system."

Or maybe we taint the kernel when we circumvent the BIOS like this.
At a minimum, I think we should log a note in dmesg.

Bjorn
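
For reference, the case discussed in this thread (a bus that an MCFG
entry covers but that falls outside the host bridge _CRS bus range) can
be sketched roughly as below. This is a user-space illustration only,
not kernel code: struct bus_range and both function names are made up
for the example, and printf() merely stands in for whatever dmesg note
the kernel would log.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical inclusive bus-number range; not a kernel type. */
struct bus_range {
	unsigned int start;
	unsigned int end;
};

static bool range_contains(const struct bus_range *r, unsigned int bus)
{
	return bus >= r->start && bus <= r->end;
}

/*
 * True when a bus is covered by an MCFG entry but lies outside the
 * host bridge _CRS bus range, i.e. the situation described above.
 */
static bool bus_hidden_from_crs(const struct bus_range *crs,
				const struct bus_range *mcfg,
				unsigned int bus)
{
	return range_contains(mcfg, bus) && !range_contains(crs, bus);
}

/* Stand-in for a dmesg note (assumption: real code would use printk). */
static void note_if_hidden(const struct bus_range *crs,
			   const struct bus_range *mcfg,
			   unsigned int bus)
{
	if (bus_hidden_from_crs(crs, mcfg, bus))
		printf("PCI: bus %02x outside _CRS [%02x-%02x] but inside MCFG [%02x-%02x]\n",
		       bus, crs->start, crs->end, mcfg->start, mcfg->end);
}
```

With an assumed _CRS range of 00-3e and an MCFG entry covering 00-3f,
bus 3f would be flagged, while bus 10 (inside both) and bus 40 (inside
neither) would not. The check says nothing about whether the BIOS
excluded the bus on purpose, which is the open question here.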