On 09/10/17 16:49, Robin Murphy wrote:
> On 09/10/17 10:22, Mathias Nyman wrote:
>> On 08.10.2017 17:03, Hao Wei Tee wrote:
>>> Hi,
>>>
>>> I've been having DMA read faults with my VL805 xHCI controller when
>>> the Intel IOMMU is turned on:
>>>
>>> xhci_hcd 0000:03:00.0: xHCI Host Controller
>>> xhci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 2
>>> DMAR: DRHD: handling fault status reg 3
>>> DMAR: [DMA Read] Request device [03:00.0] fault addr de28a000
>>> [fault reason 01] Present bit in root entry is clear
>>> <snip many identical DMAR faults>
>>> xhci_hcd 0000:03:00.0: can't setup: -110
>>> xhci_hcd 0000:03:00.0: USB bus 2 deregistered
>>> xhci_hcd 0000:03:00.0: init 0000:03:00.0 fail, -110
>>> xhci_hcd: probe of 0000:03:00.0 failed with error -110
>>>
>>> The controller works fine, as far as I can tell, when the IOMMU is off.
>>>
>>> I've tracked it down to where CMD_RESET is sent to the controller in
>>> xhci_reset, [1] called from xhci_gen_setup in xhci.c. It seems that
>>> when the command register is being polled in the xhci_handshake after
>>> that, the controller tries to do a DMA read from an address that is
>>> apparently invalid (?). Eventually xhci_handshake times out.
>>>
>>> I've tried setting the XHCI_NO_64BIT_SUPPORT quirks flag as someone
>>> suggested in an earlier thread here [2] about a similar/the same(?)
>>> device, but that doesn't seem to have worked.
>>>
>>> Help, please. I have no idea how to debug this further.
>>>
>>
>> Could it maybe be related to a iommu/vt-d: Fix scatterlist offset
>> handling fix:
>> https://lists.linuxfoundation.org/pipermail/iommu/2017-September/024371.html
>>
>> Can you check if that patch is included?
>>
>> The author Robin Murphy (CC) Also had some recent issues with a VIA
>> VL805 controller
>>
>> https://marc.info/?l=linux-usb&m=150730678304383&w=2
>
> I'm pretty confident this is unrelated to the intel-iommu issue that
> my patch above addresses.
> On my arm64 test system, the VL805 is
> consistently playing up even *without* an IOMMU - dd'ing from a USB3
> mass storage device throws up a series of block layer errors like this:
>
> [ 138.658733] sd 2:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
> [ 138.666853] sd 2:0:0:0: [sdb] tag#0 CDB: opcode=0x28 28 00 00 00 00 a8 00 01 80 00
> [ 138.674369] print_req_error: I/O error, dev sdb, sector 168
>
>
> Brain dump so far:
>
> - I can reliably produce these errors using dd with a block size of
>   128K or greater; more generally they seem correlated with
>   dma_map_sg() calls where the scatterlist is 32 or more entries long.
>
> - without the IOMMU, block sizes >=128K all settle down into a
>   suspiciously-periodic error every 2048 sectors.
>
> - with the IOMMU, the faulting write address is always the first byte
>   of a page immediately following a valid XHCI DMA mapping; I'm no USB
>   expert, but having now generated the debug log below, this might
>   actually just be a symptom of the queue getting out of whack earlier.
>
> - FWIW, neither XHCI_NO_64BIT_SUPPORT as mentioned in the other thread,
>   nor XHCI_BROKEN_STREAMS per the other VIA quirk, makes any visible
>   difference.
>
> - The same device works quite happily in USB 2.0 ports on the same
>   system (via on-SoC EHCI), and with a different USB 3.0 PCIe card
>   based on a Renesas uPD720201.

Actually, I tell a lie there - I was getting confused with the results
from the USB3-ethernet adapter. With the Renesas card, dd'ing from the
USB3-SATA adapter *does* still generate the same periodic error every
2048 sectors with block sizes >= 128K, but it recovers an awful lot
quicker each time, and never triggers IOMMU faults. The USB 2.0 host has
no issues.

Robin.
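
As a rough sanity check on the numbers above, here is a small sketch and
not anything taken from the kernel. It assumes 4 KiB pages, 512-byte
sectors and one scatterlist entry per page, none of which is stated in
this thread (arm64 kernels can also be built with 16K or 64K pages). The
helper names pages_spanned() and decode_read10() are made up for
illustration. Under those assumptions a 128K dd block covers 32 pages,
2048 sectors is 1 MiB, and the failing CDB decodes as a READ(10) starting
at LBA 168, which matches the "sector 168" in the log.

```python
import struct

PAGE_SIZE = 4096    # assumption: 4 KiB pages (not stated in the thread)
SECTOR_SIZE = 512   # assumption: 512-byte logical sectors


def pages_spanned(block_size, page_size=PAGE_SIZE):
    """Pages covered by a page-aligned buffer of block_size bytes."""
    return -(-block_size // page_size)  # ceiling division


def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns (lba, number_of_blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Layout: opcode, flags, 32-bit big-endian LBA, group number,
    # 16-bit big-endian transfer length, control byte.
    _op, _flags, lba, _group, nblocks, _ctrl = struct.unpack(">BBIBHB", cdb)
    return lba, nblocks


if __name__ == "__main__":
    print("pages in a 128K block:", pages_spanned(128 * 1024))
    print("bytes in 2048 sectors:", 2048 * SECTOR_SIZE)
    lba, nblocks = decode_read10("28 00 00 00 00 a8 00 01 80 00")
    print("READ(10): lba", lba, "blocks", nblocks)
```

The 32-page figure only lines up with the "32 or more entries" observation
if each page really ends up as its own scatterlist entry; contiguous pages
may be merged into fewer entries, so treat this as arithmetic and not as a
diagnosis.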
--->8---
lsusb -v output for this cheap no-name adapter plugged into the Renesas
card, complete with 100% reproducible stall in the process:

Bus 004 Device 003: ID 13fd:3940 Initio Corporation
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass            0
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         9
  idVendor           0x13fd Initio Corporation
  idProduct          0x3940
  bcdDevice            3.09
  iManufacturer           1 TS1GSDOM
  iProduct                2 22V
  iSerial                 3 32303131313230313030303041303030
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           44
    bNumInterfaces          1
    bConfigurationValue     1
    iCo[ 1014.305508] xhci_hcd 0000:04:00.0: Stalled endpoint for slot 1 ep 0
[ 1014.314281] xhci_hcd 0000:04:00.0: Cleaning up stalled endpoint ring
[ 1014.320574] xhci_hcd 0000:04:00.0: Finding endpoint context
[ 1014.326094] xhci_hcd 0000:04:00.0: Cycle state = 0x1
[ 1014.331010] xhci_hcd 0000:04:00.0: New dequeue segment = ffff800053590600 (virtual)
[ 1014.338593] xhci_hcd 0000:04:00.0: New dequeue pointer = 0xffabf940 (DMA)
[ 1014.345314] xhci_hcd 0000:04:00.0: Queueing new dequeue state
[ 1014.351008] xhci_hcd 0000:04:00.0: Set TR Deq Ptr cmd, new deq seg = ffff800053590600 (0xffabf000 dma), new deq ptr = ffff00000a655940 (0xffabf940 dma), new cycle = 1
[ 1014.365730] xhci_hcd 0000:04:00.0: // Ding dong!
[ 1014.370307] xhci_hcd 0000:04:00.0: Giveback URB ffff80005b0ad000, len = 0, expected = 4, status = -32
[ 1014.379476] xhci_hcd 0000:04:00.0: Ignoring reset ep completion code of 1
[ 1014.386206] xhci_hcd 0000:04:00.0: Successful Set TR Deq Ptr cmd, deq = @ffabf940
nfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower               36mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               7
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x0a  EP 10 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               7
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength           22
  bNumDeviceCaps          2
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000002
      HIRD Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   1
      Lowest fully-functional device speed is Full Speed (12Mbps)
    bU1DevExitLat          10 micro seconds
    bU2DevExitLat         128 micro seconds
can't get debug descriptor: Resource temporarily unavailable
Device Status:     0x0000
  (Bus Powered)