Hi, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> writes: >> Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> writes: >> >> >> > On Thu, 29 Jun 2017, kernel test robot wrote: >> >> >> > >> >> >> >> FYI, we noticed the following commit: >> >> >> >> >> >> >> >> commit: f16443a034c7aa359ddf6f0f9bc40d01ca31faea ("USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks") >> >> >> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master >> >> >> >> >> >> >> >> in testcase: trinity >> >> >> >> with following parameters: >> >> >> >> >> >> >> >> runtime: 300s >> >> >> >> >> >> >> >> test-description: Trinity is a linux system call fuzz tester. >> >> >> >> test-url: http://codemonkey.org.uk/projects/trinity/ >> >> >> >> >> >> >> >> >> >> >> >> on test machine: qemu-system-x86_64 -enable-kvm -m 420M >> >> >> >> >> >> >> >> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): >> >> >> > ... >> >> >> > >> >> >> > I won't include the entire report. The gist is that we have a problem >> >> >> > with lock ordering. The report is about dummy-hcd, but this could >> >> >> > affect any UDC driver. >> >> >> > >> >> >> > 1. When a SETUP request arrives, composite_setup() acquires >> >> >> > cdev->lock before calling the function driver's callback. >> >> >> > When that callback submits a reply, it causes the UDC driver >> >> >> > to acquire its private lock. >> >> >> >> >> >> this only seems to happen before calling set_config(): >> >> >> >> >> >> case USB_REQ_SET_CONFIGURATION: >> >> >> if (ctrl->bRequestType != 0) >> >> >> goto unknown; >> >> >> if (gadget_is_otg(gadget)) { >> >> >> if (gadget->a_hnp_support) >> >> >> DBG(cdev, "HNP available\n"); >> >> >> else if (gadget->a_alt_hnp_support) >> >> >> DBG(cdev, "HNP on another port\n"); >> >> >> else >> >> >> VDBG(cdev, "HNP inactive\n"); >> >> >> } >> >> >> spin_lock(&cdev->lock); >> >> >> value = set_config(cdev, ctrl, w_value); >> >> >> spin_unlock(&cdev->lock); >> >> >> break; >> >> > >> >> > That's true. Why is the lock held for set_config() but not for any of >> >> > the other callbacks? >> >> >> >> this is really old code from Dave. Your guess is as good as mine :-( >> >> >> >> > Doesn't that mean the other callbacks can race with function >> >> > unregistration? >> >> >> >> Probably not as UDCs are required to cancel transfers and kill all >> >> endpoints before unregistering. We would probably just giveback a few >> >> requests with -ESHUTDOWN and prevent new ones from being queued to HW, >> >> no? >> > >> > But SETUP callbacks aren't associated with pending requests. They get >> >> they are if ->setup() returns DELAYED_STATUS. > > No. In the DELAYED_STATUS case, the gadget driver submits a request > _after_ the SETUP packet is received. > > In no case is there a pending request that an incoming SETUP packet is > associated with. Oh, I see now what you mean. You're talking *specifically* the SETUP stage :-) Yeah, I agree. Currently, we don't have a request associated with it. >> > generated whenever a SETUP packet is received, even if the gadget >> > driver has no requests queued. Cancelling transfers won't prevent >> > them. >> > >> > Killing all endpoints might or might not do the trick. Does killing >> > ep0 prevent the UDC driver from receiving SETUP packets? This may vary >> > between UDC drivers. >> >> no, it does not. If ->setup() fails, we are required to Stall and >> restart ep0. Restarting ep0 involves preparing it to receive a new >> SETUP request. > > I wasn't talking about stalling or restarting ep0; I was talking about > killing it. Right, there's no killing of ep0 unless we're talking about module removal. In case of ->setup() returning an error, we *must* stall and restart ep0 (reprogram DMA, prepare 8-byte buffer, etc). If we don't receive SETUP packet, then we don't even get a completion interrupt on ep0. >> > There are also the other callbacks (reset, disconnect, bus suspend, and >> > bus resume), which aren't associated with endpoints or requests at all. >> > >> > Probably drivers really should have something like synchronize_irq() in >> > the udc_stop routine. That would solve a lot of problems (although it >> > wouldn't help dummy_hcd, which doesn't rely on interrupts). >> >> Hmm, free_irq() calls synchronize_irq(). UDCs should just request_irq() >> on udc_start() and free_irq() on udc_stop(). That's what dwc3 is doing. > > In general, I agree. But free_irq() isn't enough; you also have to > tell the UDC hardware to stop generating interrupt requests. Otherwise > you'll end up with a "nobody cared!" error. Yeah, but that's also up to the UDC driver. It must guarantee that interrupts are masked before freeing the handler. This is not USB-specific in any way, right?: -) -- balbi -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html