> >> > >> I recently got a report about a bit similar issue on a 4.4 stable > >> kernel, so this might not be xhci-cdns specific. > >> > >> That case involved autosuspend of the cdc-acm, and there was only a > >> short burst of transaction erros and resubmitted URBs even if the device was > supposed to be suspended. > >> It looks like cdc_acm autosuspended even if it had URBs pending. > > > > That must not happen. Do you have details? > > Shared what I got in: > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdrive.google. > com%2Fopen%3Fid%3D1PjwmIK97bIfugWL697lJCe65yuDVOZhx&data=02% > 7C01%7Cpeter.chen%40nxp.com%7C3c5815ca5ec345be017508d7c057cc6d%7C6 > 86ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637189355630062080& > sdata=O146bkkswCyFhoBTy92uDtMY%2Bn43ho78%2F69IxjuXE2M%3D&res > erved=0 > > cdc-acm is unfortunately not traced, but usb core and xhci is. > grep for "MATTU" to find the relevant places. > > > > >> I'm guessing that in that case the transfer ring restarted even if > >> link was already "suspeded", causing transaction errors. Ring could > >> be restarted if URBs were resubmitted by the class driver when usb > >> core suspends all interfaces, flushing all pending URBs which calls the URB > completion handler. > > > > How should a class driver do that? It gets -EPROTO and that's it, > > > > Regards > > Oliver > > > > This is just my speculation, I haven't checked details yet. > > It's not just the class driver that causes this, but a combination of the following gaps > in xhci, cdc-acm, and usb core. > > A class driver autosuspending with URBs pending, and not killing all URBs > synchronously when usb core calls suspend for the interface drivers. > > xhci restarting an endpoint ring due to a race between xhci_stop_device(), handling > a Transaction error, and having pending URB (unhandled trb) on the ring. [1] > > Sleeping between setting port to U3 link state, and clearing udev->can_submit. > allowing URBs to be submitted during that gap when link is in U3 already. > URBs will complete with -EPROTO, and resubmitted until udev->can_submit is > cleared. > > usb_runtime_suspend() > usb_suspend_both() // suspend a device and its interfaces > for (each udev->actconfig->interface) > usb_suspend_interface(udev, intf, msg); > driver = to_usb_driver(intf->dev.driver); > status = driver->suspend(intf, msg); // URBs shuold be killed, sync, miss > one?? > usb_suspend_device() > generic_suspend() > usb_port_suspend() > hub_set_port_link_state(USB_SS_PORT_LS_U3) > xhci_stop_device() > xhci_set_link_state(USB_SS_PORT_LS_U3) // port link is in U3 > msleep() // during this URBs can be resubmitteded and complete with - > EPROTO loop > udev->can_submit = 0; // we can submit URBs until here (except the killed ones, > they are flagged). > for (each endpoint) > usb_hcd_flush_endpoint(udev, udev->ep_out[i]); > usb_hcd_flush_endpoint(udev, udev->ep_in[i]); > > [1] > xhci_stop_device() > xhci_queue_stop_endpoint() > -> interrupt (transfer event, ring has not stopped yet) > handle_tx_event() // bulk transfer with Transaction error > process_bulk_intr_td() > finish_td() > xhci_cleanup_halted_endpoint() > xhci_queue_reset_ep() > xhci_queue_new_dequeue_state() > -> interupt, command completion event for stop endpoint, interrupt > -> handle reset_ep command, xhci_handle_cmd_stop_ep() interrupt handle > -> new deq state > xhci_handle_cmd_set_deq() > ring_doorbell_for_active_rings(xhci, slot_id, ep_index) - this restarts the ring. > If autosuspend is suspicious, Jonas, could you please try to disable autosuspend for all USB devices (including the roothub and controller) to see what happens? Peter