Hello, On Thursday, February 23, 2012 10:53 AM Felipe Balbi wrote: > On Thu, Feb 16, 2012 at 11:29:45AM -0800, Michal Nazarewicz wrote: > > On Wed, 15 Feb 2012 00:03:26 -0800, Felipe Balbi <balbi@xxxxxx> wrote: > > > > >On Tue, Feb 14, 2012 at 10:10:12PM -0800, Michal Nazarewicz wrote: > > >>>On Fri, Feb 10, 2012 at 09:54:51AM +0100, Lukasz Majewski wrote: > > >>>>It is crucial to assign each req->context value to struct rndis. > > >>>> > > >>>>The problem happens for multi function gadget (g_multi) when multiple > > >>>>functions are calling common usb_composite_dev control request. > > >>>> > > >>>>It might happen that *_setup method from one usb function will > > >>>>alter some fields of this common request issued by other USB > > >>>>function. > > >>>this is a bit weird, why would a function change a request which isn't > > >>>supposed to handled by it ? I mean, functions need to check the ctrl > > >>>request to be sure they should be handling it. > > >> > > >>It's composite layer who directs requests to functions. Functions do > > >>not need to check the request. > > > > > >well, composite does call into functions correctly, but functions need > > >to see if the control request is valid for them still. > > > > Once composite calls given function, the request is valid for them until > > its completed or aborted. > > What I mean is that host could be sending a request with the correct > RECIP_INTERFACE but wrong ctrl->bRequest. > > > >>>Your patch shouldn't be necessary, really, because the context is > > >>>initialized when the request is allocated. > > >> > > >>Yes, but there is only one request for EP0 which is shared between > > >>all the functions. > > > > > >true. > > > > > >>>what happens exactly ? Who changes the context field and why ? > > >> > > >>The context is changed by all other functions. > > > > > >I didn't see that, actually, look at f_acm: > > > > [...] > > > > >or f_eem: > > > > [...] > > > > >or f_ecm: > > > > [...] > > > > >or f_audio: > > > > [...] > > > > >or f_hid: > > > > [...] > > > > >or f_sourcesink: > > > > [...] > > > > >or f_uvc: > > > > [...] > > > > That's because those functions do not set req->complete. > > > > All functions that care when the request is finished and set req->complete > > will probably need to set req->context as well (unless they don't use > > req->context but that may lead to some global variables, yada yada). > > > > For all the other functions, which do not send req->complete, the request > > will get handled by composite_setup_complete() which does not care about > > context (as it only prints a message). > > > > >For all I know, the buggy gadget drivers are f_ncm, f_mass_storage and > > >f_fs because they change a request which *does not* belong to them. > > >The real fix would be to sort that out in another way. The ep0 request > > >always belongs to composite.c so functions can't go around poking with > > >that request. > > > > Of course they can. If I lend you my (hypothetical) car, I will let you > > adjust the seat. Same here, composite may own the usb_ctrlrequest structure, > > but by calling f->setup() it lends it to the function which can then adjust > > it in ways it needs. Until request is completed or aborted, the function is > > the only one that can use the request. > > > > >Maybe the best solution would be to allow ep0 to have multiple requests > > >queued, then functions can allocate and maintain their own ep0 request for > > >cases when they need to send anyting to the host, maybe. > > > > Maybe I'm missing something, but as far as I know, when function wants to send > > data to host on EP0, host needs to initialise the transfer. This results in > > composite_setup() being called, and until this request is finished (or aborted), > > there will be no other control requests sent on EP0. > > On all other endpoints we allow multiple requests to be queued to the > endpoint and each of them will be handled in sequence. Maybe we need to > have the same for the control pipe, so different functions can queue > their own stuff to ep0. > > Allowing for functions to change req->context will likely cause more > issues if we decide to shuffle things around and change how composite > handles requests. > > Besides, some of req->context changes could be done in a different way. > In f_ncm.c for instance, it receives the Setup phase by using > composite.c's request, but then it changes the setup phase request and > requeues it to receive the data phase, and does nothing with the Status > phase. > > This is one of the places which causes the most confusion and the most > problems when writing UDC drivers for linux. There's very little > "standardization" on who/how should Setup/Data/Status phases be handled > on the framework. It's pretty much a trial and error approach. > > I say again that the best solution would be to allow functions to queue > their own requests to ep0, rather than re-using composite.c's request. > That way, we can build into the framework the notion that composite.c's > request is for Setup phase ONLY and function is required to queue > optinal Data and status phases. So a request such as SetNtbInputSize > would require two usb_ep_queue from f_ncm, one for receiving the Data > phase and one for sending Status phase, as a bonus we avoid the bug you > mentioned on your original email. The problem is even worse. Receiving setup phase has nothing to do with a request - it is done internally by the udc driver without any request enqueued from the gadget (or function). Then, udc driver calls gadgets setup() method, which in turn might call setup() method for the respective usb composite function driver. All the setup data is passed in the argument, there is no request associated directly with it (although the udc driver might use some internal request, like s3c-hsotg.c). Then, gadget queues request as an answer for the setup() method - it might enqueue it directly from the setup() method or use delay it and queue from the worker or thread if some synchronization is required (like mass storage). This request usually corresponds to data phase. However there is a special case. If there is no data phase, gadget enqueues a ZLP request (which corresponds to status phase in fact!). In both cases the gadget has finished its job. It is up to the UDC driver to correctly handle the Status phase (again with some kind of internal request or so). This IS quite ugly and definitely not well documented. We gathered this knowledge from observing various usb functions (rndis, mass storage, acm, ecm and others) and analyzing their code. Luckily only a few usb functions use data phase for their control requests, so they just work even if the udc drivers are buggy. Correct handling of data requests which are a multiple of max packet size reveals another level of possible bugs in the udc drivers. Getting back to composite framework and it's ep0 request. It is only used to enqueue the reply for the gadget setup() method. Usually composite core use it to return the usb descriptors or strings, but I really see no problem if it is shared between composite core and the usb functions. The only requirement is to correctly reinitialize req->complete and req->context entries before enqueueing the request, what is already done by the usb function drivers and composite core. rndis was just buggy by missing of that assignment in one of the place and relied on the fact that the value has not been changed since the reply for the previous setup request (handled by the other function, which sets req->context correctly). I agree that we need to document this behavior better, but right now this patch is really needed to fix an ugly race which ends with kernel oops. Best regards -- Marek Szyprowski Samsung Poland R&D Center -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html