Hi, On 17 January 2017 at 18:39, Felipe Balbi <balbi@xxxxxxxxxx> wrote: > > Hi, > > Baolin Wang <baolin.wang@xxxxxxxxxx> writes: >>>>>>> Baolin Wang <baolin.wang@xxxxxxxxxx> writes: >>>>>>>> When handing the SETUP packet by composite_setup(), we will release the >>>>>>>> dwc->lock. If we get the 'USB_GADGET_DELAYED_STATUS' result from setup >>>>>>>> function, which means we need to delay handling the STATUS phase. >>>>>>> >>>>>>> this sentence needs a little work. Seems like it's missing some >>>>>>> information. >>>>>>> >>>>>>> anyway, I get that we release the lock but... >>>>>>> >>>>>>>> But during the lock release period, maybe the request for handling delay >>>>>>> >>>>>>> execution of ->setup() itself should be locked. I can see that it's only >>>>>>> locked for set_config() which is rather peculiar. >>>>>>> >>>>>>> What exact request you had when you triggered this? (Hint: dwc3 >>>>>>> tracepoints print out ctrl request bytes). IIRC, only set_config() or >>>>>>> f->set_alt() can actually return USB_GADGET_DELAYED_STATUS. >>>>>> >>>>>> Yes, when host set configuration for mass storage driver, it can >>>>>> produce this issue. >>>>>> >>>>>>> >>>>>>> Which gadget driver were you using when you triggered this? >>>>>> >>>>>> mass storage driver. When host issues the setting config request, we >>>>>> will get USB_GADGET_DELAYED_STATUS result from >>>>>> set_config()--->fsg_set_alt(). Then the mass storage driver will issue >>>>>> one thread to complete the status stage by ep0_queue() (this thread >>>>>> may be running on another core), then if the thread issues ep0_queue() >>>>>> too fast before we get the dwc->lock in dwc3_ep0_delegate_req() or >>>>>> before we get into the STATUS stage, then we can not handle this >>>>>> request for the delay STATUS stage in dwc3_gadget_ep0_queue(). >>>>>> >>>>>>> >>>>>>> Another point here is that the really robust way of fixing this is to >>>>>>> get rid of USB_GADGET_DELAYED_STATUS altogether and just make sure >>>>>>> gadget drivers know how to queue requests for all three phases of a >>>>>>> Control Transfer. >>>>>>> >>>>>>> A lot of code will be removed from all gadget drivers and UDC drivers >>>>>>> while combining all of it in a single place in composite.c. >>>>>>> >>>>>>> The reason I'm saying this is that other UDC drivers might have similar >>>>>>> races already but they just haven't triggered. >>>>>> >>>>>> Yes, maybe. >>>>>> >>>>>>> >>>>>>>> STATUS phase has been queued into list before we set 'dwc->delayed_status' >>>>>>>> flag or entering 'EP0_STATUS_PHASE' phase, then we will miss the chance >>>>>>>> to handle the STATUS phase. Thus we should check if the request for delay >>>>>>>> STATUS phase has been enqueued when entering 'EP0_STATUS_PHASE' phase in >>>>>>>> dwc3_ep0_xfernotready(), if so, we should handle it. >>>>>>>> >>>>>>>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxx> >>>>>>>> --- >>>>>>>> drivers/usb/dwc3/ep0.c | 14 ++++++++++++++ >>>>>>>> 1 file changed, 14 insertions(+) >>>>>>>> >>>>>>>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c >>>>>>>> index 9bb1f85..e689ced 100644 >>>>>>>> --- a/drivers/usb/dwc3/ep0.c >>>>>>>> +++ b/drivers/usb/dwc3/ep0.c >>>>>>>> @@ -1123,7 +1123,21 @@ static void dwc3_ep0_xfernotready(struct dwc3 *dwc, >>>>>>>> dwc->ep0state = EP0_STATUS_PHASE; >>>>>>>> >>>>>>>> if (dwc->delayed_status) { >>>>>>>> + struct dwc3_ep *dep = dwc->eps[0]; >>>>>>>> + >>>>>>>> WARN_ON_ONCE(event->endpoint_number != 1); >>>>>>>> + /* >>>>>>>> + * We should handle the delay STATUS phase here if the >>>>>>>> + * request for handling delay STATUS has been queued >>>>>>>> + * into the list. >>>>>>>> + */ >>>>>>>> + if (!list_empty(&dep->pending_list)) { >>>>>>>> + dwc->delayed_status = false; >>>>>>>> + usb_gadget_set_state(&dwc->gadget, >>>>>>>> + USB_STATE_CONFIGURED); >>>>>>> >>>>>>> Isn't this patch also changing the normal case when usb_ep_queue() comes >>>>>>> later? I guess list_empty() protects against that... >>>>>> >>>>>> I think it will not change other cases, we only handle the delayed >>>>>> status and I've tested it for a while and I did not find any problem. >>>>> >>>>> Alright, it's important enough to fix this bug. Can you also have a look >>>>> into dropping USB_GADGET_DELAYED_STATUS altogether? If you're too busy, >>>>> no issues. It'll stay in my queue. >>>> >>>> Okay, I will have a look at f_mass_storage driver to see if we can >>>> drop USB_GADGET_DELAYED_STATUS. Thanks. >>> >>> not only mass storage. It needs to be done for all drivers. The way to >>> do that is to teach functions that control transfers are composed of two >>> or three phases. If you look at UDC drivers today, they all have >>> peculiarities about control transfers to handle stuff that *maybe* >>> gadget drivers won't handle. >>> >>> What we should do here is make sure that *all* 3 phases always have a >>> matching usb_ep_queue() coming from the upper layers. Whether >>> composite.c or f_*.c handles it, that's an implementation detail. But >>> just to illustrate the problem, we should be able to get rid of >>> dwc3_ep0_out_start() and assume that the upper layer will call >>> usb_ep_queue() when it wants to receive a new SETUP packet. >>> >>> Likewise, we should be able to assume that STATUS phase will only start >>> based on a usb_ep_queue() call. That way we can remove >>> USB_GADGET_DELAYED_STATUS altogether, because that will *always* be the >>> case. There will be no races to handle apart from the normal case where >>> XferNotReady can come before or after usb_ep_queue(), but we already >>> have proper handling for that too. >> >> Thanks to explain, but seems it need lots of work to convert these >> drivers, and I saw Alan had some concern about that. So I am not sure >> how to convert these drivers which can meet your requirements, maybe >> from adding one "wants_explicit_ctrl_phases" flag in struct >> usb_gadget as you suggested to start? > > yeah. Something like this: > > patch 1: add the new flag and document it > patch 2: implement usb_ep_queue() for data and status phases conditional > to that flag being set > patch 3: (e.g.) change dwc3 to rely on that behavior and pass the flag > to usb_gadget. > > this will be enough to actually test that the idea and implementation > works out. After that, just for_each_UDC_driver() port_patch_3(); Then > you add: > > patch N - 1: remove legacy code and force behavior of > wants_explicit_ctrl_phases > patch N: remove wants_explicit_ctrl_phases > > something along these lines. Okay, I will try to do that. Thanks. -- Baolin.wang Best Regards -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html