Re: usb: gadget: net2280: Fix function net2280_dequeue()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 27 Feb 2019 guido@xxxxxxxxxxxxxxxxxx wrote:

> Zitat von Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>:
> 
> > On Tue, 19 Feb 2019 guido@xxxxxxxxxxxxxxxxxx wrote:
> >
> >> Hi Alan,
> >>
> >> My last proposal "udc: net2280: Fix overrun of OUT messages" is still
> >> under investigation.
> >>
> >> During the random stress tests I found a new rare problem:
> >>
> >> When a request must be dequeued with net2280_dequeue() e.g. due
> >> to a device clear action and the same request is finished by the
> >> function scan_dma_completions():
> >>
> >> https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/usb/gadget/udc/net2280.c#L1269
> >>
> >> then the following search loop does not find the request and the function
> >> returns the error -EINVAL without restoring the state ep->stopped = stopped!
> >>
> >> https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/usb/gadget/udc/net2280.c#L1269
> >>
> >> Thus the endpoint keeps blocked and does not receive any data.
> >> When we insert ep->stopped = stopped here:
> >>
> >> 	if (&req->req != _req) {
> >> 		ep->stopped = stopped; // <<<<<----
> >> 		spin_unlock_irqrestore(&ep->dev->lock, flags);
> >> 		dev_err(&ep->dev->pdev->dev, "%s: Request mismatch\n",
> >> 								__func__);
> >> 		return -EINVAL;
> >> 	}
> >>
> >> then the driver continues as expected, although the driver issues the
> >> false error message "Request mismatch".
> >>
> >> What do you think? It's labor-intensive to fix the error message here,
> >> or shall we leave it as it is, and only insert the "ep->stopped = stopped;"
> >
> > Yes, this is a bug.  I think the dev_err() should be changed to
> > ep_dbg().  It's not a real error, after all -- it's just a race.
> 
> I agree. ep_dbg() is better here. I see the same bug in the sibling driver
> net2272_dequeue():
> https://elixir.bootlin.com/linux/v5.0-rc8/source/drivers/usb/gadget/udc/net2272.c#L948
> 
> The bug is quite obvious to me, however I'm not aware that we use this driver
> in any of our instruments. I cannot test it. What do you recommend?
> - Shall I fix it within the same commit.
> - Shall I fix it with another commit.
> - Don't care about it

You can fix it in a separate commit, and in the commit log say that it 
was only compile-tested because you don't have the right hardware for a 
full test.

> > Do that and insert the "ep->stopped = stopped;" line.  Also, consider
> > changing the -EINVAL return code here to -EIDRM.  This is analogous to
> > what usb_unlink_urb() does on the host side.
> 
> EIDRM sounds better. However I see that all udc drivers calling the
> <udc>_dequeue() function returns -EINVAL when a request cannot be dequeued.
> I do not dare to break this tradition.

I guess it's too late to change this now.  A pity...

Alan Stern

> BTW we tested my last proposal:
> https://patchwork.kernel.org/patch/10796161/
> 
> ... I see that we can simplify this patch when we just insert the
> stop_out_nacking(ep) between line 909 and 910:
> 
> https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L909
> 
> and remove the stop_out_naking() call in start_queue()...
> 
> I will add this improvement to the new patch sequence.
> 
> Guido




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux