On 21.12.2022 0.12, Joe Bolling wrote:
> [1.] One line summary of the problem:
> Error 110 from ASMedia Host Controller
>
> [2.] Full description of the problem/report:
> I'm seeing a failure from XHCI_HCD when I stream video from Intel Realsense D435 cameras through an ASMedia ASM3042 USB host controller. The issue usually manifests as repeated Error 110s from the camera as long as I'm trying to stream data:
>
> [ 100.227800] usb 4-1: Failed to query (SET_CUR) UVC control 1 on unit 3: -110 (exp. 1024).
>
> Followed by a bit of a lockup from XHCI_HCD. lsusb will hang and I can't get any image data from the camera.
>
> This problem seems to happen sooner when there are multiple cameras connected and streaming. In the logs below, I'm streaming from four cameras, two connected to an ASM3042 and two via an Intel host controller. It seems to happen when I stop and re-start streaming from the cameras repeatedly.
>
> dmesg and tracing output are located in this folder:
> https://bostondynamics1.box.com/s/qtn28it8avda6pvve5sowyaeff4jzlyr

Had a quick look at the trace.

The control transfer that times out is queued at:

95.030596: xhci_urb_enqueue: ep0out-control: urb 000000005be6faad pipe 2147484160 slot 1 length 0/1024 sgs 0/0 stream 0 flags 00110000
95.030597: xhci_queue_trb: CTRL: bRequestType 21 bRequest 01 wValue 0100 wIndex 0300 wLength 1024 length 8 TD size 0 intr

It never completes, so it's cancelled after 5 seconds. The xhci driver stops the endpoint to remove the cancelled transfer:

100.268771: xhci_urb_dequeue: ep0out-control: urb 000000001ac66029 pipe 2147484160 slot 1 length 0/1024 sgs 0/0 stream 0 flags 00110000
100.268797: xhci_dbg_cancel_urb: Cancel URB 000000001ac66029, dev 1, ep 0x0, starting at offset 0x16f13f9c0
100.268804: xhci_queue_trb: CMD: Stop Ring Command: slot 1 sp 0 ep 1 flags c

The trace is a bit hard to follow as we can't distinguish between hosts. It also seems that some events are just missing from the trace.

Control transfers on ep0 fail after this.

No idea why this transfer does not complete, but I'd start by taking a better look at the 'Context State Error' responses to stop endpoint commands.

This error should mostly occur when a stop endpoint command races with an error on the endpoint, and in those cases the endpoint state should be "error" or "halted". In this trace the endpoint state is often "stopped":

105.388818: xhci_queue_trb: CMD: Stop Ring Command: slot 1 sp 0 ep 1 flags c
105.389099: xhci_handle_event: EVENT: TRB 000000016f583260 status 'Context State Error' len 0 slot 1 ep 0 type 'Command Completion Event' flags e:c
105.389101: xhci_handle_command: CMD: Stop Ring Command: slot 1 sp 0 ep 1 flags c
105.389104: xhci_handle_cmd_stop_ep: State stopped mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Ctrl burst 0 maxp 512 deq 000000016f13f9f1 avg trb len 0

Many other control transfer requests in this log cause a protocol stall, meaning the device doesn't support the request.
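
A rough way to go through the 'Context State Error' responses is a small script like the one below. It is only a sketch: it assumes the trimmed "timestamp: event: ..." line format quoted above, pairs each error with the next xhci_handle_cmd_stop_ep line, and cannot tell hosts apart any better than the trace itself can. The names check_stop_ep_errors and EXPECTED_STATES are made up for illustration and are not part of any kernel tooling.

```python
import re

# Endpoint states in which a 'Context State Error' reply to a stop endpoint
# command is the expected result of a race with an endpoint error.
EXPECTED_STATES = {"halted", "error"}

# Patterns for the three trace events of interest (format assumed from the
# excerpts quoted in this mail; a raw trace may carry extra leading columns,
# which re.search() tolerates).
EVENT_RE = re.compile(r"(\d+\.\d+): xhci_handle_event: .*'Context State Error'")
COMMAND_RE = re.compile(r"\d+\.\d+: xhci_handle_command: CMD: (.*?):")
STATE_RE = re.compile(r"\d+\.\d+: xhci_handle_cmd_stop_ep: State (\w+)")


def check_stop_ep_errors(lines):
    """Return (timestamp, endpoint_state, is_expected) for each
    'Context State Error' that is followed by a stop endpoint state dump."""
    results = []
    pending_ts = None  # timestamp of a Context State Error awaiting its state
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            pending_ts = m.group(1)
            continue
        m = COMMAND_RE.search(line)
        if m and m.group(1) != "Stop Ring Command":
            # The error belonged to some other command; drop it.
            pending_ts = None
            continue
        m = STATE_RE.search(line)
        if m and pending_ts is not None:
            state = m.group(1)
            results.append((pending_ts, state, state in EXPECTED_STATES))
            pending_ts = None
    return results


# The four trace lines quoted above, used as sample input.
SAMPLE = """\
105.388818: xhci_queue_trb: CMD: Stop Ring Command: slot 1 sp 0 ep 1 flags c
105.389099: xhci_handle_event: EVENT: TRB 000000016f583260 status 'Context State Error' len 0 slot 1 ep 0 type 'Command Completion Event' flags e:c
105.389101: xhci_handle_command: CMD: Stop Ring Command: slot 1 sp 0 ep 1 flags c
105.389104: xhci_handle_cmd_stop_ep: State stopped mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Ctrl burst 0 maxp 512 deq 000000016f13f9f1 avg trb len 0
""".splitlines()

if __name__ == "__main__":
    for ts, state, expected in check_stop_ep_errors(SAMPLE):
        note = "expected (race with endpoint error)" if expected else "unexpected, worth a closer look"
        print(f"{ts}: Context State Error with endpoint state '{state}': {note}")
```

Run against the full trace file (for example by passing open(path) instead of SAMPLE), this would list every stop endpoint command that failed while the endpoint claimed to be in a state other than "halted" or "error".
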
The xHC halts the host side of an endpoint on both functional and protocol stalls, and the endpoint then needs to be recovered with a reset endpoint command.

Maybe the ASMedia host-side endpoint is really halted even though it reports "stopped", and needs a reset endpoint command to recover?

Thanks
Mathias