Hi Mathias, So the problem I found with v4.10-rc1 doesn't appear to be a regression. I can't, however, trigger it with Broadwell, only Skylake and Kabylake. According to tracepoints, our Reset Device Command sometimes completes with "Context State Error", which tells us that we're issuing Reset Device Command when we shouldn't. Another thing I noticed is that we're clearing PortFeature(PortReset) less than 20ms after setting it. Full tracepoint data attached (slot 28 is the device in question. Slot 12, AFAICT, it's parent hub), but here's a snippet showing both problems: > device-reset-2819 [002] .... 154.793058: xhci_urb_enqueue: ep0out-control: urb ffff8801b8cc90c0 pipe 2147485696 slot 12 length 0/0 sgs 0/0 stream 0 flags 00100000 > device-reset-2819 [002] d..1 154.793059: xhci_queue_trb: CTRL: bRequestType 23 bRequest 03 wValue 0004 wIndex 0001 wLength 0 length 8 TD size 0 intr 0 type 'Setup Stage' flags b:I:i:c:s:i:e:c SetPorFeature(PortReset, 1); > device-reset-2819 [002] d..1 154.793059: xhci_queue_trb: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags b:i:I:c:s:i:e:C > <idle>-0 [002] d.h2 154.805677: xhci_handle_event: EVENT: TRB 00000001bd90f2b0 status 'Success' len 0 slot 12 ep 1 type 'Transfer Event' flags e:C > <idle>-0 [002] d.h2 154.805679: xhci_handle_transfer: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags b:i:I:c:s:i:e:C > <idle>-0 [002] dNh1 154.805682: xhci_urb_giveback: ep0out-control: urb ffff8801b8cc90c0 pipe 2147485696 slot 12 length 0/0 sgs 0/0 stream 0 flags 00000000 Hub acked the transfer. Note success status above. Also note timestamp. > device-reset-2819 [002] .... 154.816800: xhci_urb_enqueue: ep0out-control: urb ffff8801b8cc90c0 pipe 2147485824 slot 12 length 0/4 sgs 0/0 stream 0 flags 00110200 > device-reset-2819 [002] d..1 154.816800: xhci_queue_trb: CTRL: bRequestType a3 bRequest 00 wValue 0000 wIndex 0001 wLength 4 length 8 TD size 0 intr 0 type 'Setup Stage' flags b:I:i:c:s:i:e:c GetPortStatus(1); > device-reset-2819 [002] d..1 154.816800: xhci_queue_trb: CTRL: Buffer 00000001bd6dada0 length 4 TD size 0 intr 0 type 'Data Stage' flags b:i:i:c:s:I:e:C > device-reset-2819 [002] d..1 154.816800: xhci_queue_trb: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags b:i:I:c:s:i:e:C > <idle>-0 [002] d.h2 154.816909: xhci_handle_event: EVENT: TRB 00000001bd90f2e0 status 'Success' len 0 slot 12 ep 1 type 'Transfer Event' flags e:C > <idle>-0 [002] d.h2 154.816910: xhci_handle_transfer: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags b:i:I:c:s:i:e:C > <idle>-0 [002] dNh1 154.816910: xhci_urb_giveback: ep0out-control: urb ffff8801b8cc90c0 pipe 2147485824 slot 12 length 4/4 sgs 0/0 stream 0 flags 00000200 > device-reset-2819 [002] .... 154.816914: xhci_urb_enqueue: ep0out-control: urb ffff8801b8cc90c0 pipe 2147485696 slot 12 length 0/0 sgs 0/0 stream 0 flags 00100000 > device-reset-2819 [002] d..1 154.816914: xhci_queue_trb: CTRL: bRequestType 23 bRequest 01 wValue 0014 wIndex 0001 wLength 0 length 8 TD size 0 intr 0 type 'Setup Stage' flags b:I:i:c:s:i:e:c CleartPortFeature(C_PortReset, 1); > device-reset-2819 [002] d..1 154.816915: xhci_queue_trb: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags b:i:I:c:s:i:e:C > <idle>-0 [002] d.h2 154.817056: xhci_handle_event: EVENT: TRB 00000001bd90f300 status 'Success' len 0 slot 12 ep 1 type 'Transfer Event' flags e:C > <idle>-0 [002] d.h2 154.817056: xhci_handle_transfer: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags b:i:I:c:s:i:e:C > <idle>-0 [002] dNh1 154.817057: xhci_urb_giveback: ep0out-control: urb ffff8801b8cc90c0 pipe 2147485696 slot 12 length 0/0 sgs 0/0 stream 0 flags 00000000 completed. It took 0.011375 S, or 11.375 mS. Even though we had a GetPortStatus() in between. This should've been more than 20 mS, right? > device-reset-2819 [002] d..1 154.868782: xhci_queue_trb: CMD: Reset Device Command: slot 28 flags C > <idle>-0 [003] d.h2 154.868832: xhci_handle_event: EVENT: TRB 00000001be2b2e60 status 'Context State Error' len 0 slot 28 ep 0 type 'Command Completion Event' flags e:C And here's xHCI telling us that slot 28's Context was is invalid status. > <idle>-0 [003] d.h2 154.868832: xhci_handle_command: CMD: Reset Device Command: slot 28 flags C > device-reset-2819 [002] .... 154.868961: xhci_dbg_address: Successful setup address command > device-reset-2819 [002] .... 154.868963: xhci_dbg_address: Op regs DCBAA ptr = 0x000001be2b3000 > device-reset-2819 [002] .... 154.868964: xhci_dbg_address: Slot ID 28 dcbaa entry @ffff8801be2b30e0 = 0x000001b8f89000 > device-reset-2819 [002] .... 154.868965: xhci_dbg_address: Output Context DMA address = 0x1b8f89000 > device-reset-2819 [002] .... 154.868966: xhci_address_ctx: ctx_64=0, ctx_type=2, ctx_dma=@1b8f88000, ctx_va=@ffff8801b8f88000 > device-reset-2819 [002] .... 154.868966: xhci_address_ctx: ctx_64=0, ctx_type=1, ctx_dma=@1b8f89000, ctx_va=@ffff8801b8f89000 > device-reset-2819 [002] .... 154.868966: xhci_dbg_address: Internal device address = 28 however we happily move forward as if nothing happened. For reference, here's dmesg: > [ 153.234970] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 153.619966] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 154.004947] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 154.389955] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 154.774946] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 159.933798] usb 1-9.3.1: device descriptor read/all, error -110 > [ 160.016965] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 160.101818] usb 1-9.3.1: device descriptor read/64, error -22 > [ 160.288820] usb 1-9.3.1: device descriptor read/64, error -22 > [ 160.479818] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 160.488041] usb 1-9.3.1: Device not responding to setup address. > [ 160.699920] usb 1-9.3.1: Device not responding to setup address. > [ 160.915801] usb 1-9.3.1: device not accepting address 24, error -71 > [ 160.998818] usb 1-9.3.1: reset full-speed USB device number 24 using xhci_hcd > [ 161.007068] usb 1-9.3.1: Device not responding to setup address. > [ 161.219917] usb 1-9.3.1: Device not responding to setup address. > [ 161.435803] usb 1-9.3.1: device not accepting address 24, error -71 > [ 161.443478] usb 1-9.3.1: USB disconnect, device number 24 > [ 161.553801] usb 1-9.3.1: new full-speed USB device number 25 using xhci_hcd > [ 161.638816] usb 1-9.3.1: device descriptor read/64, error -22 > [ 161.824818] usb 1-9.3.1: device descriptor read/64, error -22 > [ 162.016799] usb 1-9.3.1: new full-speed USB device number 26 using xhci_hcd > [ 162.100818] usb 1-9.3.1: device descriptor read/64, error -22 > [ 162.288817] usb 1-9.3.1: device descriptor read/64, error -22 > [ 162.479798] usb 1-9.3.1: new full-speed USB device number 27 using xhci_hcd > [ 162.487844] usb 1-9.3.1: Device not responding to setup address. > [ 162.699918] usb 1-9.3.1: Device not responding to setup address. > [ 162.907800] usb 1-9.3.1: device not accepting address 27, error -71 > [ 162.991799] usb 1-9.3.1: new full-speed USB device number 28 using xhci_hcd > [ 162.999844] usb 1-9.3.1: Device not responding to setup address. > [ 163.211916] usb 1-9.3.1: Device not responding to setup address. > [ 163.419800] usb 1-9.3.1: device not accepting address 28, error -71 > [ 163.427063] usb 1-9.3-port1: unable to enumerate USB device attached a tarball with both dmesg.txt and trace.txt
Attachment:
xhci-trace-dmesg.tar.bz2
Description: Binary data
-- balbi
Attachment:
signature.asc
Description: PGP signature