[ +CC: Alan ]

On Fri, Oct 04, 2019 at 02:59:33PM +0300, Mathias Nyman wrote:
> udev stored in ep->hcpriv might be NULL if tt buffer is cleared
> due to a halted control endpoint during device enumeration
>
> xhci_clear_tt_buffer_complete is called by hub_tt_work() once it's
> scheduled, and by then usb core might have freed and allocated a
> new udev for the next enumeration attempt.
>
> Fixes: ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer")
> Cc: <stable@xxxxxxxxxxxxxxx> # v5.3
> Reported-by: Johan Hovold <johan@xxxxxxxxxx>
> Signed-off-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
> ---
>  drivers/usb/host/xhci.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 00f3804f7aa7..517ec3206f6e 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -5238,8 +5238,16 @@ static void xhci_clear_tt_buffer_complete(struct usb_hcd *hcd,
>  	unsigned int ep_index;
>  	unsigned long flags;
>
> +	/*
> +	 * udev might be NULL if tt buffer is cleared during a failed device
> +	 * enumeration due to a halted control endpoint. Usb core might
> +	 * have allocated a new udev for the next enumeration attempt.
> +	 */
> +
>  	xhci = hcd_to_xhci(hcd);
>  	udev = (struct usb_device *)ep->hcpriv;
> +	if (!udev)
> +		return;

I didn't have time to look into this myself last week, or comment on
the patch before Greg picked it up, but this clearly isn't the right
fix.

As your comment suggests, ep->hcpriv may indeed be NULL here if USB core
has allocated a new udev. But this only happens after USB has freed the
old usb_device and the new one happens to get the same address.

Note that even the usb_host_endpoint itself (ep) has then been freed and
reallocated since it is a member of struct usb_device, and it is the
use-after-free that needs fixing.

I've even been able to trigger another NULL-deref in this function
before a new udev has been allocated, due to the virt dev having been
freed by xhci_free_dev as part of usb_release_dev:

[ 19.627771] usb 2-2.4: unable to read config index 0 descriptor/start: -32
[ 19.627966] usb 2-2.4: chopping to 0 config(s)
[ 19.628133] usb 2-2.4: can't read configurations, error -32
[ 19.629017] usb 2-2.4: usb_release_dev - udev = ffff930b14d82000

udev is freed here

[ 19.629258] usb 2-2-port4: attempt power cycle
[ 19.629461] xhci_clear_tt_buffer_complete - udev = ffff930b14d82000

use-after-free when tt work is scheduled (note that udev is non-NULL
since udev hasn't been reallocated and initialised yet):

[ 19.629643] xhci_clear_tt_buffer_complete - xhci->devs[4] = 0000000000000000

virt dev is NULL after having been freed by xhci_free_dev()

[ 19.629876] BUG: kernel NULL pointer dereference, address: 0000000000000030

and is later dereferenced

[ 19.630034] #PF: supervisor write access in kernel mode
[ 19.630155] #PF: error_code(0x0002) - not-present page
[ 19.630270] PGD 0 P4D 0
[ 19.630341] Oops: 0002 [#1] SMP
[ 19.630425] CPU: 2 PID: 110 Comm: kworker/2:2 Not tainted 5.4.0-rc1 #28
[ 19.630572] Hardware name: /D34010WYK, BIOS WYLPT10H.86A.0051.2019.0322.1320 03/22/2019
[ 19.636141] Workqueue: events hub_tt_work
[ 19.638125] RIP: 0010:xhci_clear_tt_buffer_complete.cold.69+0x9b/0xcd

It seems the xhci clear-tt implementation was incomplete since it did
not take care to wait for any ongoing work before disabling the
endpoint. EHCI does this in ehci_endpoint_disable(), but xhci doesn't
even implement that callback.

As this may be something you could end up hitting in other paths as
well, perhaps we should even consider reverting the offending commit
pending a more complete implementation?

>  	slot_id = udev->slot_id;
>  	ep_index = xhci_get_endpoint_index(&ep->desc);

For reference, my original report is here:

	https://lkml.kernel.org/r/20190930103107.GC13531@localhost

Johan
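
To make the ordering requirement concrete, here is a minimal userspace
sketch (not kernel code) of the rule that pending clear-TT work must be
drained or cancelled before the endpoint is disabled and freed. All
names in it (struct endpoint, struct tt_work, queue_clear_tt,
run_tt_work, endpoint_disable, endpoint_free) are hypothetical
stand-ins, not xhci or USB core APIs, and it models a single work item
with no real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel types or APIs. */
struct endpoint {
	void *hcpriv;          /* plays the role of ep->hcpriv (the udev) */
	bool tt_work_pending;  /* a clear-TT completion is still queued */
};

struct tt_work {
	struct endpoint *ep;   /* borrowed pointer: valid only while ep lives */
	bool queued;
};

static int completions_run;

/* Queue a deferred completion that will touch ep when it runs. */
static void queue_clear_tt(struct tt_work *w, struct endpoint *ep)
{
	w->ep = ep;
	w->queued = true;
	ep->tt_work_pending = true;
}

/* What the worker does once the work item is finally scheduled. */
static void run_tt_work(struct tt_work *w)
{
	if (!w->queued)
		return;                     /* nothing queued: ep is not touched */
	w->queued = false;
	w->ep->tt_work_pending = false; /* dereferences the borrowed pointer */
	completions_run++;
}

/*
 * Model of an endpoint-disable hook: finish any queued work for this
 * endpoint now, so nothing referencing ep is left behind.
 */
static void endpoint_disable(struct tt_work *w, struct endpoint *ep)
{
	if (w->queued && w->ep == ep)
		run_tt_work(w);
}

/* Freeing is only legal once no deferred work can reach ep. */
static void endpoint_free(struct endpoint *ep)
{
	assert(!ep->tt_work_pending);
	free(ep);
}
```

With this ordering, calling queue_clear_tt(), then endpoint_disable(),
then endpoint_free() leaves a later run_tt_work() as a no-op. Skipping
the endpoint_disable() step is the analogue of the use-after-free above:
the worker would dereference a freed endpoint, and a NULL check on
hcpriv would not reliably catch that.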