Re: Possible race condition in f_mass_storage gadget during deinitialization. Kernel warning issued

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Adrian,

On 9/10/2018 11:54 AM, Adrian Ambrożewicz wrote:
> Thanks for the patch. I was happy to blindly apply it on my sources and 
> check the result. Unfortunately it ended up with even worse crash than 
> before.
> 
> I've noticed that uninitialized stack variable "flags" is passed 
> to  spin_unlock_irqrestore . I've tried to tweak the patch and pass 
> "flags" from every calling function, but there exists at least one 
> calling context in which flags are not available -  dwc2_hsotg_irq .
> 
> Another attempt I've made was to try to retrieve something sensible to 
> pass to spin_unlock_irqrestore(&hsotg->lock, flags); by 
> calling local_irq_save(flags); . I'm not that experienced and I can't 
> determine what side effects this change may introduce but at least it 
> got rid of the panics.
> 
> After applying this adjustment it looks like originally reported warning 
> is gone.
> 
> I will be happy to test another version of the patch if you find it 
> necessary to pursue that way.
> 
> Regards,
> Adrian
> 
> czw., 6 wrz 2018 o 14:22 użytkownik Minas Harutyunyan 
> <Minas.Harutyunyan@xxxxxxxxxxxx <mailto:Minas.Harutyunyan@xxxxxxxxxxxx>> 
> napisał:
> 
>     Hi Adrian,
> 
>     On 8/20/2018 4:34 PM, Adrian Ambrożewicz wrote:
>      > In my current workspace the kernel used is 4.17.7 .
>      >
>      > Unfortunately I don't have the resources now to verify with newer
>      > version but I might look into that later if it's necessary. I've only
>      > compared source code between my worktree and mainline sources and
>      > verified that code around this area still looks almost exactly the
>      > same (with minor changes like renames of API calls here and there).
>      >
>      > Looking forward for Minas opinion.
>      > pon., 20 sie 2018 o 13:20 Felipe Balbi
>     <felipe.balbi@xxxxxxxxxxxxxxx <mailto:felipe.balbi@xxxxxxxxxxxxxxx>>
>      > napisał(a):
>      >>
>      >>
>      >> Hi,
>      >>
>      >> Adrian Ambrożewicz <adrian.ambrozewicz@xxxxxxxxx
>     <mailto:adrian.ambrozewicz@xxxxxxxxx>> writes:
>      >>> Hello,
>      >>>
>      >>> I'm consistently observing kernel warnings related to Mass
>     Storage USB
>      >>> Gadget de-initialization flow. After investigation I believe
>     that I've
>      >>> found root cause of these warnings, however I'm unable to
>     estimate the
>      >>> impact. I would like to put the issue into discussion about
>     possible
>      >>> side-effects.
>      >>>
>      >>> My usage model is to emulate file system using f_mass_storage
>     gadget.
>      >>> Whenever I pull the USB cable out I see warning related to
>      >>> "dwc2_hsotg_init_fifo" which is commented in the following way: "/*
>      >>> Reset fifo map if not correctly cleared during previous session
>     */".
>      >>
>      >> which kernel are you using? Have you tried latest mainline?
>     (currently
>      >> at v4.18).
>      >>
>      >>> Assumption I've made were the following:
>      >>> 1) disconnect handler notifies all gadgets with call_gadget(..,
>     disconnect)
>      >>> 2) gadgets are then responsible to clear the ep fifos with
>     usb_ep_disable()
>      >>> 3) disconnect handler initializes the fifo maps to clear state
>      >>>
>      >>> Unfortunately in my case the warning occurs every time when using
>      >>> f_mass_storage gadget. By comparison with f_hid gadget I've come up
>      >>> with conclusion that it's caused by race condition in the way the
>      >>> "disable" flow is implemented in mass storage.
>      >>>
>      >>> "Correct" flow implemented by HID gadget is the following:
>      >>> * dwc2_hsotg_irq (USBRst)
>      >>> ** dwc2_hsotg_disconnect
>      >>> *** call_gadget(disconnect)
>      >>> **** hidg_disable
>      >>> ***** usb_ep_disable // fifos utilized by device are cleared in
>     fifo_map
>      >>> ** dwc_hsotg_core_init_disconnected
>      >>> *** dwc2_hsotg_init_fifo // fifo_map is empty - no warning here
>      >>>
>      >>> By comparison here is flow observed in Mass Storage gadget. Race
>      >>> condition is introduced by utilizing separate worker thread to
>     handle
>      >>> events:
>      >>> THREAD #1
>      >>> * dwc2_hsotg_irq (USBRst)
>      >>> ** dwc2_hsotg_disconnect
>      >>> *** call_gadget(disconnect)
>      >>> **** fsg_disable()
>      >>> ***** raise_exception(FSG_STATE_CONFIG_CHANGE) // Race between
>     Thread
>      >>> #1 and Thread #2 starts here.
>      >>> ** dwc_hsotg_core_init_disconnected
>      >>> *** dwc2_hsotg_init_fifo // fifo_map is object of the race between
>      >>> this function, and call to usb_ep_disable()
>      >>>
>      >>> THREAD #2
>      >>> * handle_exception(FSG_STATE_CONFIG_CHANGE)
>      >>> ** do_set_interface(NULL)
>      >>> *** usb_ep_disable()
>      >>>
>      >>> My questions are the following:
>      >>> - is this known issue?
>      >>> - is it risky? What are possible outcomes?
>      >>> - If this race condition is dangerous what would be proper fix ?
>      >>> Should fsg_disable() call be blocked until handle_exception()
>     finishes
>      >>> the cleanup? I know that it's just a WA for "misaligned" Mass
>     Storage
>      >>> gadget architecture, but are there alternatives?
>      >>>
>      >>> I would really appreciate professional insight on that matter,
>      >>
>      >> Let's see if Minas has anything to say about this.
>      >>
>      >> --
>      >> balbi
>      >
> 
>     Could you please apply this patch and test again.
> 
>     diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
>     index 403e99026c52..2bc924c55488 100644
>     --- a/drivers/usb/dwc2/gadget.c
>     +++ b/drivers/usb/dwc2/gadget.c
>     @@ -3171,6 +3171,8 @@ static void dwc2_hsotg_irq_fifoempty(struct
>     dwc2_hsotg *hsotg, bool periodic)
>                              GINTSTS_PTXFEMP |  \
>                              GINTSTS_RXFLVL)
> 
>     +static int dwc2_hsotg_ep_disable(struct usb_ep *ep);
>     +
>        /**
>         * dwc2_hsotg_core_init - issue softreset to the core
>         * @hsotg: The device state
>     @@ -3184,13 +3186,27 @@ void dwc2_hsotg_core_init_disconnected(struct
>     dwc2_hsotg *hsotg,
>              u32 val;
>              u32 usbcfg;
>              u32 dcfg = 0;
>     +       unsigned long flags;
>     +       int ep;
> 
>              /* Kill any ep0 requests as controller will be reinitialized */
>              kill_all_requests(hsotg, hsotg->eps_out[0], -ECONNRESET);
> 
>     -       if (!is_usb_reset)
>     +       if (!is_usb_reset) {
>                      if (dwc2_core_reset(hsotg, true))
>                              return;
>     +       }
>     +       else {
>     +               /* all endpoints should be shutdown */
>     +               spin_unlock_irqrestore(&hsotg->lock, flags);
>     +               for (ep = 1; ep < hsotg->num_of_eps; ep++) {
>     +                       if (hsotg->eps_in[ep])
>     +                             
>       dwc2_hsotg_ep_disable(&hsotg->eps_in[ep]->ep);
>     +                       if (hsotg->eps_out[ep])
>     +                             
>       dwc2_hsotg_ep_disable(&hsotg->eps_out[ep]->ep);
>     +               }
>     +               spin_lock_irqsave(&hsotg->lock, flags);
>     +       }
> 
>              /*
>               * we must now enable ep0 ready for host detection and then
> 
> 
> 
> 
>     Thanks,
>     Minas
> 

Could you please apply and test the patch "[PATCH] usb: dwc2: Disable 
all EP's on disconnect".

Thanks,
Minas




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux