On Mon, 20 Aug 2018, Adrian Ambrożewicz wrote: > Hello, > > I'm consistently observing kernel warnings related to Mass Storage USB > Gadget de-initialization flow. After investigation I believe that I've > found root cause of these warnings, however I'm unable to estimate the > impact. I would like to put the issue into discussion about possible > side-effects. > > My usage model is to emulate file system using f_mass_storage gadget. > Whenever I pull the USB cable out I see warning related to > "dwc2_hsotg_init_fifo" which is commented in the following way: "/* > Reset fifo map if not correctly cleared during previous session */". > > Assumption I've made were the following: > 1) disconnect handler notifies all gadgets with call_gadget(.., disconnect) > 2) gadgets are then responsible to clear the ep fifos with usb_ep_disable() > 3) disconnect handler initializes the fifo maps to clear state > > Unfortunately in my case the warning occurs every time when using > f_mass_storage gadget. By comparison with f_hid gadget I've come up > with conclusion that it's caused by race condition in the way the > "disable" flow is implemented in mass storage. > > "Correct" flow implemented by HID gadget is the following: > * dwc2_hsotg_irq (USBRst) > ** dwc2_hsotg_disconnect > *** call_gadget(disconnect) > **** hidg_disable > ***** usb_ep_disable // fifos utilized by device are cleared in fifo_map > ** dwc_hsotg_core_init_disconnected > *** dwc2_hsotg_init_fifo // fifo_map is empty - no warning here > > By comparison here is flow observed in Mass Storage gadget. Race > condition is introduced by utilizing separate worker thread to handle > events: > THREAD #1 > * dwc2_hsotg_irq (USBRst) > ** dwc2_hsotg_disconnect > *** call_gadget(disconnect) > **** fsg_disable() > ***** raise_exception(FSG_STATE_CONFIG_CHANGE) // Race between Thread > #1 and Thread #2 starts here. > ** dwc_hsotg_core_init_disconnected > *** dwc2_hsotg_init_fifo // fifo_map is object of the race between > this function, and call to usb_ep_disable() > > THREAD #2 > * handle_exception(FSG_STATE_CONFIG_CHANGE) > ** do_set_interface(NULL) > *** usb_ep_disable() This seems like a bug in the dwc2_hsotg driver; it should include sufficient locking to handle races of this sort. A gadget's disconnect handler is not required to disable all endpoints. In fact it can't disable endpoints at all, because the disconnect handler may be called in interrupt context whereas usb_ep_disable() must be called in process context. As a result, usb_ep_disable() is expected to be called after the disconnect handler has returned. > My questions are the following: > - is this known issue? > - is it risky? What are possible outcomes? > - If this race condition is dangerous what would be proper fix ? > Should fsg_disable() call be blocked until handle_exception() finishes > the cleanup? I know that it's just a WA for "misaligned" Mass Storage > gadget architecture, but are there alternatives? I don't know the answers to these questions. The dwc2 maintainer probably has a better understanding of the issues. However, the f_mass_storage gadget architecture is not "misaligned". And given the current implementation of the composite framework, fsg_disable() cannot block. Alan Stern > I would really appreciate professional insight on that matter, > > Regards, > Adrian