On 19/07/2020 12:09, Greg KH wrote:
> On Sun, Jul 19, 2020 at 11:22:10AM +0100, Tj (Elloe Linux) wrote:
>> With all kernels from 4.14 to 5.8.0-rc5 we're seeing failures with uas
>> on a Turris Mox aarch64 Marvell Armada 3720 that we don't see on amd64.
>>
>> The device that triggers them is:
>>
>> Bus 003 Device 002: ID 152d:0562 JMicron Technology Corp. / JMicron USA
>> Technology Corp.
>>
>
> That implies that the host controller, or the PCI controller code, is
> not working on the arm system well?

With 5.8.0-rc5 we're seeing fewer errors in the log (although devices are
still unusable) than we see with the Turris Mox kernel v4.14.187, where
there are many repeated 30 second timeouts of the form:

Jun 3 08:53:43 turris kernel: [ 1881.659833] scsi host0: uas_eh_device_reset_handler success
Jun 3 08:53:51 turris kernel: [ 1889.671447] usb 3-1: cmd cmplt err -71
Jun 3 08:53:52 turris kernel: [ 1890.300858] usb 3-1: USB disconnect, device number 3
Jun 3 08:53:52 turris kernel: [ 1890.307043] sd 0:0:0:0: tag#1 uas_zap_pending 0 uas-tag 2 inflight: CMD

>> [ 46.152049] scsi host0: uas_eh_device_reset_handler start
>> [ 46.285155] usb 3-1: reset SuperSpeed Gen 1 USB device number 2 using
>> xhci_hcd
>> [ 46.312219] scsi host0: uas_eh_device_reset_handler success
>> [ 76.827742] scsi host0: uas_eh_device_reset_handler start
>> [ 76.831151] sd 0:0:0:0: [sda] tag#21 uas_zap_pending 0 uas-tag 1
>> inflight:
>> [ 76.837629] sd 0:0:0:0: [sda] tag#21 CDB: opcode=0x28 28 00 1d cf 2f
>> d8 00 00 28 00
>> [ 76.845513] sd 0:0:0:0: [sda] tag#20 uas_zap_pending 0 uas-tag 2
>> inflight:
>> [ 76.852678] sd 0:0:0:0: [sda] tag#20 CDB: opcode=0x28 28 00 1d cf 2f
>> 28 00 00 a8 00
>> [ 76.992756] usb 3-1: reset SuperSpeed Gen 1 USB device number 2 using
>> xhci_hcd
>> ...
>
> Where is an error here?  Those looks ok to me.

Are these repeated 'zaps' and resets every 30 seconds or so not errors?
They never stop, even though the devices are neither mounted nor being
accessed (by users).
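To put a number on the "every 30 seconds or so", here is a minimal sketch that prints the gap between consecutive uas_eh_device_reset_handler start lines in a kernel log read from stdin. The function name reset_intervals is hypothetical, and the sketch assumes the log lines carry the usual "[ seconds.micros]" printk timestamp, as in the excerpts above.

```shell
#!/bin/sh
# Hypothetical helper: print the gap, in seconds, between consecutive
# "uas_eh_device_reset_handler start" lines read from stdin.
# Assumes printk timestamps of the form "[ 46.152049]" are present.
reset_intervals() {
    awk '
        /uas_eh_device_reset_handler start/ {
            # Locate the bracketed timestamp anywhere in the line, so that
            # syslog-prefixed lines work as well as raw dmesg output.
            if (match($0, /\[ *[0-9]+\.[0-9]+\]/)) {
                ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
                if (seen) printf "%.3f\n", ts - prev
                prev = ts
                seen = 1
            }
        }
    '
}
```

Usage would be along the lines of `dmesg | reset_intervals`. Fed the two start lines quoted above (46.152049 and 76.827742), it prints 30.676.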
>> [ 199.939976] blk_update_request: I/O error, dev sda, sector 500117464
>> op 0x0:(READ) flags 0x80700 phys_seg 5 prio class 0
>
> So only the block layer is reporting errors, not the USB layer?  Any usb
> controller errors?
>
> And what USB controller driver are you using here?

From what I can deduce in sysfs it is xhci_hcd (note: same issue with 1
or 2 identical devices attached):

$ lsusb -d 152d:0562
Bus 003 Device 003: ID 152d:0562 JMicron Technology Corp. / JMicron USA Technology Corp.
Bus 003 Device 002: ID 152d:0562 JMicron Technology Corp. / JMicron USA Technology Corp.

$ ls -l /sys/bus/usb/devices/3-{1,2}
lrwxrwxrwx 1 root root 0 Jul 19 09:19 /sys/bus/usb/devices/3-1 -> ../../../devices/platform/soc/d0070000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb3/3-1
lrwxrwxrwx 1 root root 0 Jul 19 12:27 /sys/bus/usb/devices/3-2 -> ../../../devices/platform/soc/d0070000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb3/3-2

$ lspci -nnk
00:00.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [1b4b:0100]
	Kernel driver in use: pcieport
01:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805 USB 3.0 Host Controller [1106:3483] (rev 01)
	Subsystem: VIA Technologies, Inc. VL805 USB 3.0 Host Controller [1106:3483]
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
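The same deduction can be made directly from the driver symlinks in sysfs rather than by matching the PCI address against lspci by eye. The following is a minimal sketch; the function name drivers_upward is hypothetical, and it assumes a readlink that supports -f (GNU coreutils and BusyBox both do).

```shell
#!/bin/sh
# Hypothetical helper: list the driver bound at each level of a sysfs
# device path, starting at the device itself and walking up to the root.
# Assumes readlink supports -f (not required by POSIX).
drivers_upward() {
    dir=$(readlink -f "$1") || return 1
    while [ "$dir" != "/" ]; do
        if [ -L "$dir/driver" ]; then
            # The driver symlink points at .../drivers/<name>;
            # print "<device dir>: <driver name>".
            printf '%s: %s\n' "$(basename "$dir")" \
                "$(basename "$(readlink "$dir/driver")")"
        fi
        dir=$(dirname "$dir")
    done
}
```

Run as `drivers_upward /sys/bus/usb/devices/3-1`. The USB device and root hub levels normally report the generic usb driver, so the entry of interest is the one for the PCI function (0000:01:00.0 here), which should agree with the "Kernel driver in use" line from lspci.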