On Wed, Sep 09, 2009 at 04:13:44PM -0400, Isabelle, Francois wrote:
> Hi.
>
> We are currently having an issue with PCIE hotplug of a LSI SAS1064E
> embedded controller when VT-d is enabled and the IOMMU driver is loaded.
> I can't tell yet if it's a fault in the iommu driver code or something
> else in the platform, but things work smoothly with the iommu disabled.

Generally, it's not likely to be a fault of the IOMMU code and more likely
a bug in the driver (not setting up DMA properly). Enabling the IOMMU just
enforces that DMA activity actually uses DMA mappings. For performance
reasons the enforcement isn't 100%, but it's nearly so.

This case might be an exception but needs more investigation.
More questions below that might help lead to the root cause.

> When the IOMMU is enabled (intel_iommu=on), the IOC gets in a FAULT state:
>
> I used this to increase the verbosity level:
>
> rmmod mptsas;rmmod mptscsih;rmmod mptbase;modprobe pciehp pciehp_debug=1 ;modprobe mptbase mpt_debug_level=0xFFFFFF;modprobe mptsas
>
>
> Some details about the platform:
>
> lspci
> 00:00.0 Host bridge: Intel Corporation 5520/5500/X58 I/O Hub to ESI Port (rev 13)
> 00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
> 00:04.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 4 (rev 13)
> 00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 13)
> 00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
> 00:08.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 8 (rev 13)
> 00:09.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 13)
> 00:0a.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 10 (rev 13)
> 00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 13)
> 00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
> 00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 13)
> 00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 13)
> 00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 13)
> 00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
> 00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
> 00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
> 00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
> 00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
> 00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
> 00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
> 02:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
> 02:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
> 03:00.0 Ethernet controller: Intel Corporation Device 10f7
> 03:00.1 Ethernet controller: Intel Corporation Device 10f7
> 04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
> 04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
> 06:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS (rev 08)
>
> And here is some data when the problem occurs.

Really need the entire console output from boot. ACPI spews info
about the IOMMU resources that are relevant to debugging it.

> pt_debug_level=ffffffh
> mptbase: ioc1: : mpt_adapter_install
> mptsas 0000:06:00.0: enabling device (0000 -> 0002)
> mptsas 0000:06:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> mptbase: ioc1: : 32 BIT PCI BUS DMA ADDRESSING SUPPORTED
> mptbase: ioc1: mem = ffffc90010570000, mem_phys = d8010000
> mptbase: ioc1: facts @ ffff88003d5df41c, pfacts[0] @ ffff88003d5df46c
> mptbase: ioc1: Initiating bringup
> mptbase: ioc1: MakeIocReady [raw] state=10000000
> mptbase: ioc1: IOC is in READY state
> 03000000 00000000 00000000
> 03140105 00001400 00000000 00000000 00000000 00080022 002001ff 27040000 00000000 00010115 00000000 01700000 00000000 00000807 011b0000 00000100 00000000 00000000 00000000 00000000
> 05000000 00000000 00000000
> 050a0000 00000000 00000000 00000000 00000000 003f3000 00090070 00700001 00000000 00000000
> ioc1: LSISAS1064E B3: Capabilities={Initiator}
> mptbase: ioc1: installed at interrupt 16
> mptbase: ioc1: PrimeIocFifos

PrimeIocFifos calls pci_alloc_consistent() and has debug code to dump
the DMA resource allocated (both virtual and DMA addresses). Offhand
I don't know how to enable that but it would be the next step.

This code is broken in that it's calling pci_alloc_consistent() before
calling pci_set_consistent_dma_mask(). This is almost certain to cause
problems if ioc->dma_mask is not DMA_BIT_MASK(32). Move the
pci_set_dma*mask() calls to the beginning of the function.
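
To make the ordering point concrete, here is a toy userspace model in
Python. It is only a sketch, not kernel code and not the mptbase source.
ToyPciDev, prime_fifos_alloc_first() and prime_fifos_mask_first() are
made-up names for illustration, and the 35-bit mask is an arbitrary
example value. The only things assumed from the kernel are that
DMA_BIT_MASK(n) is the low n bits set, that a PCI device starts out with
a 32-bit coherent mask, and that a coherent allocation honors whatever
mask is in effect when it is made.

```python
def dma_bit_mask(n):
    """Mirror of the kernel's DMA_BIT_MASK(n): the low n bits set."""
    if not 1 <= n <= 64:
        raise ValueError("n must be between 1 and 64")
    return (1 << n) - 1


class ToyPciDev:
    """Hypothetical stand-in for struct pci_dev; tracks only the coherent mask."""

    def __init__(self):
        # Assumption: PCI devices start out with a 32-bit coherent DMA mask.
        self.consistent_dma_mask = dma_bit_mask(32)

    def set_consistent_dma_mask(self, mask):
        # Models pci_set_consistent_dma_mask(): record the addressing limit.
        self.consistent_dma_mask = mask

    def alloc_consistent(self):
        # Models pci_alloc_consistent(): the buffer is placed under whatever
        # mask is in effect at call time. Return that mask for inspection.
        return self.consistent_dma_mask


def prime_fifos_alloc_first(wanted_mask):
    """Ordering criticized above: allocate first, set the mask afterwards."""
    dev = ToyPciDev()
    limit_used = dev.alloc_consistent()
    dev.set_consistent_dma_mask(wanted_mask)
    return limit_used


def prime_fifos_mask_first(wanted_mask):
    """Suggested ordering: set the mask first, then allocate."""
    dev = ToyPciDev()
    dev.set_consistent_dma_mask(wanted_mask)
    return dev.alloc_consistent()


if __name__ == "__main__":
    wanted = dma_bit_mask(35)  # arbitrary non-32-bit mask, for illustration only
    print(hex(prime_fifos_alloc_first(wanted)))  # limit the buffer was allocated under
    print(hex(prime_fifos_mask_first(wanted)))
```

In the model, the allocate-first ordering places the buffer under the
default 32-bit limit and the mask requested afterwards never applies to
it. Whether that mismatch is what triggers the fault here is not
established; the DMA dump from PrimeIocFifos would be needed to confirm.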

> mptbase: ioc1: SendIocInit
> 02000004 00017000 00000000
> 02050004 00017000 00000000 00000000 00000000
> 06000000 00000000 00000000
> DRHD: handling fault status reg 2
> DMAR:[DMA Read] Request device [06:00.0] fault addr fffc2000

fffc2000 seems to be an unusual address to DMA from/to.
Is fffc2000 reserved address space for the IOMMU?
(ACPI DMAR info should tell us this)

> DMAR:[fault reason 06] PTE Read access is not set

It's also odd that "Read Access is not set" for something (ioc_init)
that I think should be bi-directional. Need to track down the code in
the MPT driver which prepares the DMA activity for SendIocInit and
compare it to the PTE's access rights.

Looking at Documentation/Intel-IOMMU.txt, fed9x000 seems to be the base
address of the IOMMU page table. But I don't know which PCI address
range is reserved for the IOMMU to decode. (Someone from Intel can
probably tell based on chipset)

hth,
grant

> mptbase: ioc1: WARNING - Issuing Reset from mpt_config!!
> mptbase: ioc1: Initiating recovery
> mptbase: ioc1: MakeIocReady [raw] state=40002000
> mptbase: ioc1: WARNING - IOC is in FAULT state!!!
> mptbase: ioc1: WARNING - FAULT code = 2000h
> mptbase: ioc1: Recovered from IOC FAULT
> 03000000 00000000 00000000
> 03140105 00001400 00000000 00000000 00000000 00080022 002001ff 27040000 00000000 00010115 00000000 01700000 00000000 00000807 011b0000 00000100 00000000 00000000 00000000 00000000
> mptbase: ioc1: PrimeIocFifos
> mptbase: ioc1: SendIocInit
> 02000004 00017000 00000000
> 02050004 00017000 00000000 00000000 00000000
> 06000000 00000000 00000000
>
>
> Here is my current hypothesis:
>
> For some reason, on hotplug reactivation, the resources claimed by the IOC are not the same that were used before and for which the IOMMU has a translation enabled and subsequent DMA access are rejected.
>
> But I'm having a hard time figuring where to look at first, should the resource assigned exactly the same?
> How is the IOMMU supposed to deal with hot removal of PCI endpoint devices?
>
> Thank you.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html