RE: USB transaction errors causing RCU stalls and kernel panics

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



 
> >>
> >> I recently got a report about a bit similar issue on a 4.4 stable
> >> kernel, so this might not be xhci-cdns specific.
> >>
> >> That case involved autosuspend of the cdc-acm, and there was only a
> >> short burst of transaction erros and resubmitted URBs even if the device was
> supposed to be suspended.
> >> It looks like cdc_acm autosuspended even if it had URBs pending.
> >
> > That must not happen. Do you have details?
> 
> Shared what I got in:
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdrive.google.
> com%2Fopen%3Fid%3D1PjwmIK97bIfugWL697lJCe65yuDVOZhx&data=02%
> 7C01%7Cpeter.chen%40nxp.com%7C3c5815ca5ec345be017508d7c057cc6d%7C6
> 86ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637189355630062080&
> sdata=O146bkkswCyFhoBTy92uDtMY%2Bn43ho78%2F69IxjuXE2M%3D&res
> erved=0
> 
> cdc-acm is unfortunately not traced, but usb core and xhci is.
> grep for "MATTU" to find the relevant places.
> 
> >
> >> I'm guessing that in that case the transfer ring restarted even if
> >> link was already "suspeded", causing transaction errors. Ring could
> >> be restarted if URBs were resubmitted by the class driver when usb
> >> core suspends all interfaces, flushing all pending URBs which calls the URB
> completion handler.
> >
> > How should a class driver do that? It gets -EPROTO and that's it,
> >
> > 	Regards
> > 		Oliver
> >
> 
> This is just my speculation, I haven't checked details yet.
> 
> It's not just the class driver that causes this, but a combination of the following gaps
> in xhci, cdc-acm, and usb core.
> 
> A class driver autosuspending with URBs pending, and not killing all URBs
> synchronously when usb core calls suspend for the interface drivers.
> 
> xhci restarting an endpoint ring due to a race between xhci_stop_device(), handling
> a Transaction error, and having pending URB (unhandled trb) on the ring. [1]
> 
> Sleeping between setting port to U3 link state, and clearing udev->can_submit.
> allowing URBs to be submitted during that gap when link is in U3 already.
> URBs will complete with -EPROTO, and resubmitted until udev->can_submit is
> cleared.
> 
> usb_runtime_suspend()
>   usb_suspend_both() // suspend a device and its interfaces
>     for (each udev->actconfig->interface)
>       usb_suspend_interface(udev, intf, msg);
> 	driver = to_usb_driver(intf->dev.driver);
> 	status = driver->suspend(intf, msg);  // URBs shuold be killed, sync, miss
> one??
>     usb_suspend_device()
>       generic_suspend()
>         usb_port_suspend()
>           hub_set_port_link_state(USB_SS_PORT_LS_U3)
>             xhci_stop_device()
>             xhci_set_link_state(USB_SS_PORT_LS_U3)  // port link is in U3
>             msleep()       // during this URBs can be resubmitteded and complete with -
> EPROTO loop
>     udev->can_submit = 0;  // we can submit URBs until here (except the killed ones,
> they are flagged).
>     for (each endpoint)
>       usb_hcd_flush_endpoint(udev, udev->ep_out[i]);
>       usb_hcd_flush_endpoint(udev, udev->ep_in[i]);
> 
> [1]
> xhci_stop_device()
>   xhci_queue_stop_endpoint()
> -> interrupt (transfer event, ring has not stopped yet)
>   handle_tx_event()  // bulk transfer with Transaction error
>     process_bulk_intr_td()
>       finish_td()
> 	xhci_cleanup_halted_endpoint()
>           xhci_queue_reset_ep()
>           xhci_queue_new_dequeue_state()
> -> interupt, command completion event for stop endpoint, interrupt
> -> handle reset_ep command, xhci_handle_cmd_stop_ep() interrupt handle
> -> new deq state
>       xhci_handle_cmd_set_deq()
> 	ring_doorbell_for_active_rings(xhci, slot_id, ep_index)  - this restarts the ring.
> 

If autosuspend is suspicious, Jonas, could you please try to disable autosuspend for all
USB devices (including the roothub and controller) to see what happens?

Peter




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux