On Mon, 11 Feb 2013 21:15:56 -0700
Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:

> On Wed, 2013-02-06 at 08:58 -0700, Alex Williamson wrote:
> > On Wed, 2013-02-06 at 07:49 -0800, Stephen Hemminger wrote:
> > > On Mon, 04 Feb 2013 15:41:24 -0700
> > > Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> > >
> > > > On Mon, 2013-02-04 at 13:28 -0700, Alex Williamson wrote:
> > > > > On Mon, 2013-02-04 at 10:36 -0800, Stephen Hemminger wrote:
> > > > > > > I think drivers/pci/search.c is identical between 3.7 and 3.8-rc1. Is
> > > > > > > this the first time you've turned on the IOMMU on that box?
> > > > > >
> > > > > > It exists in 3.7 and earlier kernels, just haven't turned on same config.
> > > > > >
> > > > > > > It's the same warning as in this bugzilla:
> > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=44881, and there's a patch
> > > > > > > there at https://bugzilla.kernel.org/show_bug.cgi?id=44881#c11, but
> > > > > > > it's just a quirk that turns off VT-d if we find certain broken
> > > > > > > bridges. It doesn't look like you have any of those (although I don't
> > > > > > > know what you have at 05:00.0).
> > > > > > >
> > > > > > > Bjorn
> > > > > >
> > > > > > This is a standard ASUS motherboard, and don't want to disable VT-d.
> > > > >
> > > > > Stephen,
> > > > >
> > > > > Can you give the lspci -vvv of device 5:00.0 to see if it's one we've
> > > > > seen before? Does the patch below help?
> > > > >
> > > > > Bjorn, I think we need to quirk it somehow. So far they've all been
> > > > > PCI-to-PCI bridges attached to root ports where we expect it's actually
> > > > > a PCIe-to-PCI bridge. Seems like maybe we could have the same attached
> > > > > to a downstream port. The patch below avoids the WARN and gives us a
> > > > > device, but of course pci_is_pcie reports wrong for this device and may
> > > > > cause some trickle down breakage.
> > > > > A more complete option might be to
> > > > > add an is_pcie flag to the device that can be set independent of
> > > > > pcie_cap. We'd need to check all the callers for assumptions, but then
> > > > > we could put the quirk in one place and hopefully fix everything.
> > > > > Thoughts? Thanks,
> > > >
> > > > This latter approach seems like it might be easier than I expected since
> > > > all the users are so well filtered through the access functions. A
> > > > quick look through who uses pci_is_pcie seems like this might be
> > > > complete, but more eyes are required. I'll upload this to the bz for
> > > > those reporters to test as well. Thoughts? Thanks,
> > > >
> > > > Alex
> > >
> > > On my hardware this gives:
> >
> > > [ 0.254621] pci_bus 0000:05: busn_res: can not insert [bus 05-ff] under [bus 00-3e] (conflicts with (null) [bus 00-3e])
> > > [ 0.254647] WARNING: Your hardware is broken, device (null) appears to be a
> > > [ 0.254647] Legacy PCI device attached directly to a PCIe device which is not a
> > > [ 0.254647] PCIe-to-PCI bridge. Per section 7.8 of the PCI Express 3.0 spec, the
> > > [ 0.254647] PCI express capability structure is required for PCI express device
> > > [ 0.254647] functions.
> > > [ 0.254653] pci 0000:05:00.0: [1b21:1080] type 01 class 0x060401
> >
> > I guess I must be calling pci_name() before it's set. The warning
> > message needs some work too, it's mainly meant for hardware vendors with
> > the hope that they might test Linux and see it before shipping these
> > broken devices. Bjorn, does this approach seem worth pursuing? Thanks,
>
> I don't know if it sways how we handle these devices, but a couple notes
> on the asmedia chip. I have one in a non-VT-d capable system and an
> add-in legacy PCI NIC shows up behind it when added to the system. The
> chip is visible on the board and is an ASM1083.
> Asmedia's website of
> course claims this device is fully compliant with the PCIe-to-PCI bridge
> spec, ignoring the multiple statements the spec contains requiring such
> devices to support a PCIe capability.
>
> Additionally, if you google for ASM1083 you'll find the next highest
> links after the product links are bug reports that not only is this
> device non-spec compliant, but it doesn't work. There seems to be an
> issue with how INTx is de-asserted (or not) leading to interrupt storms
> and requiring the use of irqpoll. Sure enough, the tulip card I
> installed generated some of these and is operating in polling mode. The
> threads indicate that these issues are not isolated to Linux, and Windows
> users also complain about devices not working or having poor performance
> installed behind this bridge. All in all, it's an absurdly broken piece
> of hardware.
>
> I wonder if instead of trying to work around it, we should just
> blacklist the device and ignore that it even exists. Stop the bus walk
> with some kind of dmesg error and provide a boot time option to scan it.
> It's not the most user friendly option, but a) most people don't seem to
> have anything behind it, b) it barely works if they do. Thanks,

Having a special-case quirk makes a lot of sense, but it needs to be done
with care. For people like me who don't use the PCI slots, a simple
one-line warning (and a blacklist) is enough. Fully ignoring the bridge
would probably break things for users who are attempting to use the PCI
slots. If there are devices on that PCI bus, the kernel should print a big
warning and engage some sort of fallback like irqpoll.
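
To make the proposed policy concrete, here is a rough sketch of the
decision in plain userspace C. It is not kernel code and not a patch:
struct bridge_info, enum bridge_action, bridge_is_known_broken() and
classify_bridge() are hypothetical names invented for this illustration.
The only values taken from the thread are the 1b21:1080 ID from the dmesg
line above and the missing PCIe capability. It also assumes the number of
devices on the secondary bus is already known at the point the decision
is made.

```c
#include <stdbool.h>
#include <stdint.h>

/* Vendor/device ID as reported in the dmesg line for 0000:05:00.0. */
#define ASMEDIA_VENDOR_ID  0x1b21
#define ASM1083_DEVICE_ID  0x1080

/* Hypothetical description of a bridge found during the bus walk. */
struct bridge_info {
	uint16_t vendor;
	uint16_t device;
	bool     has_pcie_cap;    /* PCIe capability structure present? */
	unsigned devices_behind;  /* devices found on the secondary bus */
};

/* Hypothetical policy outcomes; these are not real kernel states. */
enum bridge_action {
	BRIDGE_NORMAL,        /* not a known-broken bridge, no special handling */
	BRIDGE_WARN_ONLY,     /* broken, nothing behind it: one-line warning */
	BRIDGE_WARN_IRQPOLL   /* broken, devices behind it: big warning plus polling fallback */
};

/* Assumption: "broken" means the ASM1083 ID without a PCIe capability. */
static bool bridge_is_known_broken(const struct bridge_info *b)
{
	return b->vendor == ASMEDIA_VENDOR_ID &&
	       b->device == ASM1083_DEVICE_ID &&
	       !b->has_pcie_cap;
}

/* Pick the action: quiet for unused slots, loud when something sits behind the bridge. */
static enum bridge_action classify_bridge(const struct bridge_info *b)
{
	if (!bridge_is_known_broken(b))
		return BRIDGE_NORMAL;
	if (b->devices_behind == 0)
		return BRIDGE_WARN_ONLY;
	return BRIDGE_WARN_IRQPOLL;
}
```

With this split, a board with empty PCI slots only ever reaches
BRIDGE_WARN_ONLY, while a system like the one with the tulip card would
land in BRIDGE_WARN_IRQPOLL. How the fallback would actually be engaged
in the kernel is left open here.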