On 4.8.2021 11.00, Greg KH wrote: > Hi, > > I was doing some filesystem backups from one USB device to another one > this weekend and kept running into the problem of the xhci controller > shutting down after an hour or so of high volume traffic. > > I finally captured the problem in the kernel log as this would also take > out my keyboard, making it hard to recover from :) > > The log is below for when the problem happens, and then the devices are > disconnected from the bus (ignore the filesystem errors, those are > expected when i/o is in flight and we disconnect a device. > > Any hint as to what the IO_PAGE_FAULT error messages are? > No idea, unfortunately. > I'll go back to 5.13.y now and see if I can reproduce it there or not, > as my backups are not yet done... > > thanks, > > greg k-h > > > [Aug 4 09:48] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff00000 flags=0x0000] > [ +0.000012] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff00f80 flags=0x0000] > [ +0.000006] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff01000 flags=0x0000] > [ +0.000006] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff01f80 flags=0x0000] > [ +0.000005] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff02000 flags=0x0000] > [ +0.000006] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff02f80 flags=0x0000] > [ +0.000006] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff03000 flags=0x0000] > [ +0.000005] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff03f80 flags=0x0000] > [ +0.000006] xhci_hcd 0000:47:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0032 address=0xfffffff04000 flags=0x0000] > [Aug 4 09:49] sd 3:0:0:0: [sdc] tag#21 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN > [ +0.000011] sd 3:0:0:0: [sdc] tag#21 CDB: Read(16) 88 00 00 00 00 01 8a 44 08 b0 00 00 00 08 00 00 > [ +5.106493] xhci_hcd 0000:47:00.1: xHCI host not responding to stop endpoint command. > [ +0.000010] xhci_hcd 0000:47:00.1: USBSTS: HCHalted HSE HSE "Host System Error" bit is set, meaning xHC hardware detected a serious error and stopped the host. HSE was probably set 5-10 seconds earlier, but only discovered here. Specs state: xHC sets this bit to ‘1’ when a serious error is detected, either internal to the xHC or during a host system access involving the xHC module. (In a PCI system, conditions that set this bit to ‘1’ include PCI Parity error, PCI Master Abort, and PCI Target Abort.) -Mathias