Re: [PATCH 12/15] xhci: Prevent early endpoint restart when handling STALL errors.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7.3.2025 8.54, Michał Pecio wrote:
Ensure that an endpoint halted due to device STALL is not
restarted before a Clear_Feature(ENDPOINT_HALT) request is sent to
the device.

The host side of the endpoint may otherwise be started early by the
'Set TR Deq' command completion handler which is called if dequeue
is moved past a cancelled or halted TD.

Prevent this with a new flag set for bulk and interrupt endpoints
when a Stall Error is received. Clear it in hcd->endpoint_reset()
which is called after Clear_Feature(ENDPOINT_HALT) is sent.

Also add a debug message if a class driver queues a new URB after
the STALL. Note that class driver might not be aware of the STALL
yet when it submits the URB as URBs are given back in BH.

Signed-off-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>

Sorry for coming this late, but I haven't looked closely at some
of those xhci/for-next patches before.

This one is unfortunately incomplete, as follows:

drivers/usb/host/xhci-ring.c | 7 +++++--
drivers/usb/host/xhci.c      | 6 ++++++
drivers/usb/host/xhci.h      | 3 ++-
3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c2e15a27338b..7643ab9ec3b4 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -556,8 +556,8 @@ void xhci_ring_ep_doorbell(struct xhci_hcd *xhci,
	 * pointer command pending because the device can choose to start any
	 * stream once the endpoint is on the HW schedule.
	 */
-	if ((ep_state & EP_STOP_CMD_PENDING) || (ep_state & SET_DEQ_PENDING) ||
-	    (ep_state & EP_HALTED) || (ep_state & EP_CLEARING_TT))
+	if (ep_state & (EP_STOP_CMD_PENDING | SET_DEQ_PENDING | EP_HALTED |
+			EP_CLEARING_TT | EP_STALLED))
		return;

Any flag added to this list needs to be added to xhci_urb_dequeue() too
so it knowns that the endpoint is held in Stopped state and URBs can be
unlinked without trying to stop it again.

In this case it's intentional.

If we prevent xhci_urb_dequeue() from queuing a stop endpoint command due to a flag,
then we must make sure the cancelled URB is given back in the same place we clear
the flag, like we do in the command completion handlers that clear EP_HALTED and
 SET_DEQ_PENDING.

The EP_STALLED flag is cleared after a ClearFeature(ENDPOINT_HALT) control transfer
request is (successfully?) sent to the device.
If we only give back those cancelled URBs after this then we create a situation where
cancelled urb giveback is blocked and depend on the completion of another transfer
on a different endpoint.
I don't want this dependency.

It's possible that this could create some type of deadlock where class driver ends
up waiting for cancelled URBs to be given back before it sends the request to clear
the halt, and  xhci won't give back the cancelld URBs before the
ClearFeature(ENDPOINT_HALT) request completes..

Lets look at the cases where xhci_urb_dequeue() is called between setting and clearing
this new EP_STALLED flag.

The EP_HALTED is set during same spinlock as EP_STALLED, so urbs dequeued during this time
will be added to cancelled list, and given back in xhci_handle_cmd_reset_ep() completion
handler where also EP_HALTED is cleared. If dequeue needs to be moved then SET_DEQ_PENDING
is set, and cancelled urbs will be given back in xhci_handle_cmd_set_deq() completion handler.

At this stage we know endpoint is in stopped state. and will remauin so until EP_STALLED is cleared.
if xhci_urb_dequeue() is called now then a stop endpoint command will ne queued,
it will complete with a context state error due to endpoint already being stopped, but
URB will be given back in one of the completion handlers. mentioned before.

We could improve this codepath a bit by adding:

iff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 0f8acbb9cd21..c8d1651c9703 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1244,7 +1244,7 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id,
                         * Endpoint later. EP state is now Stopped and EP_HALTED
                         * still set because Reset EP handler will run after us.
                         */
-                       if (ep->ep_state & EP_HALTED)
+                       if (ep->ep_state & (EP_HALTED | EP_STALLED)
                                break;
                        /*
                         * On some HCs EP state remains Stopped for some tens of

  	case USB_ENDPOINT_XFER_CONTROL:
@@ -3202,6 +3207,7 @@ static void xhci_endpoint_reset(struct usb_hcd *hcd,
  		return;
ep = &vdev->eps[ep_index];
+	ep->ep_state &= ~EP_STALLED;

... and clearing any of those flags has always been followed by calling
xhci_ring_ep_doorbell() again, to ensure that the endpoint is restarted
if it has URBs on it but restart was held off due to the flag.


Probably no harm in ringing the doorbell here. Should not be needed as there
shouldn't be any pending URBs, see usb core message.c comment for usb_clear_halt():

 * This is used to clear halt conditions for bulk and interrupt endpoints,
 * as reported by URB completion status.  Endpoints that are halted are
 * sometimes referred to as being "stalled".  Such endpoints are unable
 * to transmit or receive data until the halt status is cleared.  Any URBs
 * queued for such an endpoint should normally be unlinked by the driver
 * before clearing the halt condition, as described in sections 5.7.5
 * and 5.8.5 of the USB 2.0 spec.

But I don't see any harm in ringing the doorbell here either.

Thanks
Mathias




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux