Hi, Danilo Krummrich <danilokrummrich@xxxxxxxxxxxxx> writes: > thanks for reviewing. np :-) > On 2017-08-15 12:03, Felipe Balbi wrote: >> Hi, >> >> Danilo Krummrich <danilokrummrich@xxxxxxxxxxxxx> writes: >>> udc_stop needs to be called before gadget driver unbind. Otherwise it >>> might happen that udc drivers still call into the gadget driver (e.g. >>> to reset gadget after OTG event). If this happens it is likely to get >>> panics from gadget driver dereferencing NULL ptr, as gadget's drvdata >>> is set to NULL on unbind. >> >> seems like the problem here is with the OTG layer, not UDC core. >> > I mentioned this just as example, it can happen whenever a UDC driver > calls > the gadget driver (e.g. by calling usb_gadget_udc_reset() in ISR) after > gadget > drivers unbind() was called already (e.g. by gadget configfs). > If this happens gadget drivers drvdata was already set to NULL by > unbind() > and reset() could result into a NULL ptr exception. > Therefore my assumption was that it needs to be prevented that the > gadget > driver is getting called after unbind. We have a known problem in the design of the gadget API that causes this races but we couldn't come up with a solution yet :-) Inverting these two calls is not the correct way to go about this :-) >>> Signed-off-by: Danilo Krummrich <danilokrummrich@xxxxxxxxxxxxx> >>> --- >>> Actually there could still be a race: >>> (CPU1 code taken from dwc3 drivers dwc3_disconnect_gadget() as >>> exsample) >>> >>> CPU0 CPU1 >>> ---- ---- >>> usb_gadget_disconnect(udc->gadget); >>> udc->driver->disconnect(udc->gadget); >>> if (dwc->gadget_driver && dwc->gadget_driver->disconnect) >>> usb_gadget_udc_stop(udc); >>> udc->driver->unbind(udc->gadget); >>> dwc->gadget_driver->disconnect(&dwc->gadget); >>> >>> UDC drivers typically set their gadget driver pointer to NULL in >>> udc_stop >>> and check for it before calling into the gadget driver. To fix the >>> issue >>> above every udc driver could apply a lock around this. >>> >>> If you see the need for having this or another solutions I can provide >>> further patches. This patch could also just serve as a base for >>> discussion >>> if someone knows a smarter solution. >>> >>> I saw this problem causing a panic on hikey960 board and provided a >>> quick >>> workaround for the same problem here: >>> https://android-review.googlesource.com/#/c/kernel/common/+/457476/ >>> (panic log in the commit message of the linked patch) >>> --- >>> drivers/usb/gadget/udc/core.c | 6 +++++- >>> 1 file changed, 5 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/usb/gadget/udc/core.c >>> b/drivers/usb/gadget/udc/core.c >>> index efce68e9a8e0..8155468afc0d 100644 >>> --- a/drivers/usb/gadget/udc/core.c >>> +++ b/drivers/usb/gadget/udc/core.c >>> @@ -1234,8 +1234,12 @@ static void usb_gadget_remove_driver(struct >>> usb_udc *udc) >>> >>> usb_gadget_disconnect(udc->gadget); >>> udc->driver->disconnect(udc->gadget); >>> - udc->driver->unbind(udc->gadget); >>> + /* udc_stop needs to be called before gadget driver unbind to >>> prevent >>> + * udc driver calls into gadget driver after unbind which could >>> cause >>> + * a nullptr exception. >>> + */ >>> usb_gadget_udc_stop(udc); >>> + udc->driver->unbind(udc->gadget); >> >> This patch is incorrect, it will prevent us from giving back requests >> to >> gadget driver properly. ->unbind() has to happen before ->udc_stop(). > > Do you mean after udc_stop the udc driver can not call the gadget driver > anymore? If not, I did not got your point, sorry for that. Can you > please > help me out? Would the changed order raise another issue I'm not aware > of? right, ->udc_stop() is supposed to completely teardown the USB controller, including disabling interrupts and all. The only thing it _can_ do from ->udc_stop() would be giving back any pending requests that were left (which would cause req->complete() to be called with an error status). But even that is unlikely in the case you mention since ->unbind() was already called. > If I understood you correctly, without this patch udc driver can not > call > the gadget driver back as well, because this would result in a NULL ptr > dereference, as unbind() sets drvdata to NULL. > > In any case the race described in my original message can still happen, > regardless of the order of udc_stop and unbind. But with this patch the > needed locking could easily done within the udc driver only. Without, > the > lock needs to be acquired before udc->driver->unbind(udc->gadget) and > released after usb_gadget_udc_stop(). Otherwise an ISR of the udc driver > trying to call into the gadget driver could do this after gadget driver > already unbound. right -- balbi
Attachment:
signature.asc
Description: PGP signature