Hi Greg

On Fri, Nov 17, 2023 at 9:53 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Nov 17, 2023 at 03:21:28PM +0800, Kuen-Han Tsai wrote:
> > The null pointer dereference happens when xhci_free_dev() frees the
> > xhci->devs[slot_id] virtual device while xhci_urb_enqueue() is
> > processing a urb and checking the max packet size.
> >
> > [106913.850735][ T2068] usb 2-1: USB disconnect, device number 2
> > [106913.856999][ T4618] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> > [106913.857488][ T4618] Call trace:
> > [106913.857491][ T4618]  xhci_check_maxpacket+0x30/0x2dc
> > [106913.857494][ T4618]  xhci_urb_enqueue+0x24c/0x47c
> > [106913.857498][ T4618]  usb_hcd_submit_urb+0x1f4/0xf34
> > [106913.857501][ T4618]  usb_submit_urb+0x4b8/0x4fc
> > [106913.857503][ T4618]  usb_control_msg+0x144/0x238
> > [106913.857507][ T4618]  do_proc_control+0x1f0/0x5bc
> > [106913.857509][ T4618]  usbdev_ioctl+0xdd8/0x15a8
> >
> > This patch adds a spinlock to the xhci_urb_enqueue function to make sure
> > xhci_free_dev() and xhci_urb_enqueue() do not race and cause null
> > pointer dereference.
>
> I thought we had a lock for this already, what changed to cause this to
> start triggering now, all these years later?

Right, there is a lock in place for xhci_urb_enqueue(), but it doesn't
protect all code segments that use xhci->devs[slot_id] within the
function.

I couldn't identify any specific changes that might have introduced
this issue. It's likely a long-standing potential problem that's
difficult to trigger under normal situations.

This issue happens when the USB enumeration process is complete, and a
user space program submits a control request to the peripheral, but
then the device is rapidly disconnected. I was able to reproduce this
issue by introducing a 3-second delay within xhci_check_maxpacket() and
disconnecting the peripheral while observing that the control request
is being processed by xhci_check_maxpacket().
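
A minimal userspace sketch of the window described above, offered as an
illustration only and not as kernel code. It assumes a pthread mutex in
place of xhci->lock; struct host, struct virt_dev, free_dev(), enqueue()
and demo() are hypothetical names made up for this example. The point it
shows is that the lookup of devs[slot_id] and the use of the result both
happen under the same lock that the free path takes, so the submit path
either sees a valid device or a NULL slot, never a pointer that is freed
underneath it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_SLOTS 4

/* Hypothetical stand-ins for the xHCI structures; not the kernel's types. */
struct virt_dev {
	int max_packet;
};

struct host {
	pthread_mutex_t lock;			/* plays the role of xhci->lock */
	struct virt_dev *devs[MAX_SLOTS];	/* plays the role of xhci->devs[] */
};

/* Disconnect path: clear the slot under the lock, then free the device. */
static void free_dev(struct host *h, int slot_id)
{
	struct virt_dev *vdev;

	pthread_mutex_lock(&h->lock);
	vdev = h->devs[slot_id];
	h->devs[slot_id] = NULL;
	pthread_mutex_unlock(&h->lock);
	free(vdev);
}

/* Submit path: look up and use the slot only while holding the lock. */
static int enqueue(struct host *h, int slot_id, int *max_packet)
{
	int ret = 0;

	pthread_mutex_lock(&h->lock);
	if (!h->devs[slot_id])
		ret = -ENODEV;	/* device went away before we got the lock */
	else
		*max_packet = h->devs[slot_id]->max_packet;
	pthread_mutex_unlock(&h->lock);
	return ret;
}

/* Single-threaded walk through both orderings; returns 0 on success. */
static int demo(void)
{
	struct host h = { .devs = { NULL } };
	int mps = 0;

	pthread_mutex_init(&h.lock, NULL);
	h.devs[1] = malloc(sizeof(*h.devs[1]));
	if (!h.devs[1])
		return -ENOMEM;
	h.devs[1]->max_packet = 64;

	/* Submit before disconnect: the device is still there. */
	assert(enqueue(&h, 1, &mps) == 0 && mps == 64);

	/* Submit after disconnect: an error, not a NULL dereference. */
	free_dev(&h, 1);
	assert(enqueue(&h, 1, &mps) == -ENODEV);

	pthread_mutex_destroy(&h.lock);
	return 0;
}
```

In the real driver the equivalent check would have to cover every use of
xhci->devs[slot_id] in the function, which is the gap described above.
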

>
> >
> > Signed-off-by: Kuen-Han Tsai <khtsai@xxxxxxxxxx>
>
> What commit id does this fix?

Should I include a "Fixes:" header even if this patch doesn't address a
bug from a specific commit?

> >
> > ---
> >  drivers/usb/host/xhci.c | 38 ++++++++++++++++++++++++--------------
> >  1 file changed, 24 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > index 884b0898d9c9..e0766ebeff0e 100644
> > --- a/drivers/usb/host/xhci.c
> > +++ b/drivers/usb/host/xhci.c
> > @@ -1522,23 +1522,32 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
> >  	struct urb_priv *urb_priv;
> >  	int num_tds;
> >
> > -	if (!urb)
> > -		return -EINVAL;
> > -	ret = xhci_check_args(hcd, urb->dev, urb->ep,
> > -					true, true, __func__);
> > -	if (ret <= 0)
> > -		return ret ? ret : -EINVAL;
> > +	spin_lock_irqsave(&xhci->lock, flags);
> > +
> > +	if (!urb) {
> > +		ret = -EINVAL;
> > +		goto done;
> > +	}
>
> Why does this have to be inside the lock?  The urb can't change here,
> can it?

You're right, no need to place those inside the lock. I will move them
out of the protection.

>
> > +
> > +	ret = xhci_check_args(hcd, urb->dev, urb->ep, true, true, __func__);
> > +	if (ret <= 0) {
> > +		ret = ret ? ret : -EINVAL;
> > +		goto done;
> > +	}
> >
> >  	slot_id = urb->dev->slot_id;
> >  	ep_index = xhci_get_endpoint_index(&urb->ep->desc);
> >  	ep_state = &xhci->devs[slot_id]->eps[ep_index].ep_state;
> >
> > -	if (!HCD_HW_ACCESSIBLE(hcd))
> > -		return -ESHUTDOWN;
> > +	if (!HCD_HW_ACCESSIBLE(hcd)) {
> > +		ret = -ESHUTDOWN;
> > +		goto done;
>
> Note, we now have completions, so all of this "goto done" doesn't need
> to happen anymore.  Not a complaint, just a suggestion for future
> changes or this one, your choice.
>

I'm not familiar with the concept of 'completions'. Can you please
provide some links or explanations to help me understand it? I use a
'goto done' statement because I follow this pattern seen in many
previous commits.
However, I'm willing to modify this approach if there's a more suitable
alternative.

Please forgive me if any of my questions seem overly basic. I'm still in
the process of learning how to contribute to the kernel community.

Thanks,
Kuen-Han

> thanks,
>
> greg k-h
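
On the "goto done" question above: one mechanism in recent kernels that
removes goto-based unlock paths is scope-based cleanup from
include/linux/cleanup.h, for example guard(spinlock_irqsave)(&xhci->lock).
Whether that is what "completions" refers to in the quoted review is an
assumption; the thread does not say. The sketch below is a userspace
analogy built on the GCC/Clang cleanup attribute and a pthread mutex, not
kernel code. unlock_on_exit(), SCOPED_LOCK(), submit(), lock_is_free() and
demo() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: runs when the guard variable goes out of scope. */
static void unlock_on_exit(pthread_mutex_t **m)
{
	if (*m)
		pthread_mutex_unlock(*m);
}

/* Hypothetical macro: take the lock now, release it on every return path. */
#define SCOPED_LOCK(m) \
	pthread_mutex_t *scoped_guard_ \
		__attribute__((cleanup(unlock_on_exit))) = \
		(pthread_mutex_lock(m), (m))

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Early returns need no "goto done": the compiler inserts the unlock. */
static int submit(const int *req, int hw_accessible)
{
	SCOPED_LOCK(&lock);

	if (!req)
		return -EINVAL;
	if (!hw_accessible)
		return -ESHUTDOWN;
	return 0;
}

/* Returns 1 if the lock is free, i.e. the last call released it. */
static int lock_is_free(void)
{
	if (pthread_mutex_trylock(&lock) != 0)
		return 0;
	pthread_mutex_unlock(&lock);
	return 1;
}

/* Exercise each return path and confirm the lock was dropped each time. */
static void demo(void)
{
	int req = 1;

	assert(submit(NULL, 1) == -EINVAL);
	assert(lock_is_free());
	assert(submit(&req, 0) == -ESHUTDOWN);
	assert(lock_is_free());
	assert(submit(&req, 1) == 0);
	assert(lock_is_free());
}
```

The trade-off is the usual one: the unlock can no longer be forgotten on
an error path, but the lock is held until the end of the enclosing scope,
so the guarded region has to be scoped deliberately.
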