On Fri, Oct 11, 2019 at 03:58:42PM +0300, Mathias Nyman wrote: > commit ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") schedules work > to clear TT buffer, but causes a use-after-free regression at the same time > > Make sure hub_tt_work finishes before endpoint is disabled, otherwise > the work will dereference already freed endpoint and device related > pointers. > > This was triggered when usb core failed to read the configuration > descriptor of a FS/LS device during enumeration. > xhci driver queued clear_tt_work while usb core freed and reallocated > a new device for the next enumeration attempt. > > EHCI driver implents ehci_endpoint_disable() that makes sure > clear_tt_work has finished before it returns, but xhci lacks this support. > usb core will call hcd->driver->endpoint_disable() callback before > disabling endpoints, so we want this in xhci as well. > > The added xhci_endpoint_disable() is based on ehci_endpoint_disable() > > Fixes: ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") > Cc: <stable@xxxxxxxxxxxxxxx> # v5.3 > Reported-by: Johan Hovold <johan@xxxxxxxxxx> > Signed-off-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> > --- > drivers/usb/host/xhci.c | 43 +++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 43 insertions(+) > > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c > index 5cfbf9a04494..6e817686d04f 100644 > --- a/drivers/usb/host/xhci.c > +++ b/drivers/usb/host/xhci.c > @@ -3071,6 +3071,48 @@ void xhci_cleanup_stalled_ring(struct xhci_hcd *xhci, unsigned int ep_index, > } > } > > +static void xhci_endpoint_disable(struct usb_hcd *hcd, > + struct usb_host_endpoint *host_ep) > +{ > + struct xhci_hcd *xhci; > + struct xhci_virt_device *vdev; > + struct xhci_virt_ep *ep; > + struct usb_device *udev; > + unsigned long flags; > + unsigned int ep_index; > + > + xhci = hcd_to_xhci(hcd); > +rescan: > + spin_lock_irqsave(&xhci->lock, flags); > + > + udev = (struct usb_device *)host_ep->hcpriv; > + if (!udev || !udev->slot_id) > + goto done; > + > + vdev = xhci->devs[udev->slot_id]; > + if (!vdev) > + goto done; > + > + ep_index = xhci_get_endpoint_index(&host_ep->desc); > + ep = &vdev->eps[ep_index]; > + if (!ep) > + goto done; > + > + /* wait for hub_tt_work to finish clearing hub TT */ > + if (ep->ep_state & EP_CLEARING_TT) { > + spin_unlock_irqrestore(&xhci->lock, flags); > + schedule_timeout_uninterruptible(1); > + goto rescan; > + } > + > + if (ep->ep_state) > + xhci_dbg(xhci, "endpoint disable with ep_state 0x%x\n", > + ep->ep_state); > +done: > + host_ep->hcpriv = NULL; > + spin_unlock_irqrestore(&xhci->lock, flags); > +} > + I used essentially the same reproducer as you did for debugging this after I first hit it with an actually stalled control endpoint, and this patch works also with my fault-injection hack. I've reviewed the code and it looks good to me except for one mostly theoretical issue. You need to check ep->hc_priv while holding the xhci->lock in xhci_clear_tt_buffer_complete() or you could end up having xhci_endpoint_disable() reschedule indefinitely while waiting for EP_CLEARING_TT to be cleared on a sufficiently weakly ordered system. Since cfbb8a84c2d2 ("xhci: Fix NULL pointer dereference in xhci_clear_tt_buffer_complete()") isn't needed anymore and is slightly misleading, I suggest amending the patch with the following: diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 9b1e15fe2c8e..6c17e3fe181a 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -5280,20 +5280,13 @@ static void xhci_clear_tt_buffer_complete(struct usb_hcd *hcd, unsigned int ep_index; unsigned long flags; - /* - * udev might be NULL if tt buffer is cleared during a failed device - * enumeration due to a halted control endpoint. Usb core might - * have allocated a new udev for the next enumeration attempt. - */ - xhci = hcd_to_xhci(hcd); + + spin_lock_irqsave(&xhci->lock, flags); udev = (struct usb_device *)ep->hcpriv; - if (!udev) - return; slot_id = udev->slot_id; ep_index = xhci_get_endpoint_index(&ep->desc); - spin_lock_irqsave(&xhci->lock, flags); xhci->devs[slot_id]->eps[ep_index].ep_state &= ~EP_CLEARING_TT; xhci_ring_doorbell_for_active_rings(xhci, slot_id, ep_index); spin_unlock_irqrestore(&xhci->lock, flags); Feel free to add my: Suggested-by: Johan Hovold <johan@xxxxxxxxxx> Reviewed-by: Johan Hovold <johan@xxxxxxxxxx> Tested-by: Johan Hovold <johan@xxxxxxxxxx> Johan