Re: usb: gadget: net2280: Fix function net2280_dequeue()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Zitat von Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>:

On Tue, 19 Feb 2019 guido@xxxxxxxxxxxxxxxxxx wrote:

Hi Alan,

My last proposal "udc: net2280: Fix overrun of OUT messages" is still
under investigation.

During the random stress tests I found a new rare problem:

When a request must be dequeued with net2280_dequeue() e.g. due
to a device clear action and the same request is finished by the
function scan_dma_completions():

https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/usb/gadget/udc/net2280.c#L1269

then the following search loop does not find the request and the function
returns the error -EINVAL without restoring the state ep->stopped = stopped!

https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/usb/gadget/udc/net2280.c#L1269

Thus the endpoint keeps blocked and does not receive any data.
When we insert ep->stopped = stopped here:

	if (&req->req != _req) {
		ep->stopped = stopped; // <<<<<----
		spin_unlock_irqrestore(&ep->dev->lock, flags);
		dev_err(&ep->dev->pdev->dev, "%s: Request mismatch\n",
								__func__);
		return -EINVAL;
	}

then the driver continues as expected, although the driver issues the
false error message "Request mismatch".

What do you think? It's labor-intensive to fix the error message here,
or shall we leave it as it is, and only insert the "ep->stopped = stopped;"

Yes, this is a bug.  I think the dev_err() should be changed to
ep_dbg().  It's not a real error, after all -- it's just a race.

I agree. ep_dbg() is better here. I see the same bug in the sibling driver
net2272_dequeue():
https://elixir.bootlin.com/linux/v5.0-rc8/source/drivers/usb/gadget/udc/net2272.c#L948

The bug is quite obvious to me, however I'm not aware that we use this driver
in any of our instruments. I cannot test it. What do you recommend?
- Shall I fix it within the same commit.
- Shall I fix it with another commit.
- Don't care about it

Do that and insert the "ep->stopped = stopped;" line.  Also, consider
changing the -EINVAL return code here to -EIDRM.  This is analogous to
what usb_unlink_urb() does on the host side.

EIDRM sounds better. However I see that all udc drivers calling the
<udc>_dequeue() function returns -EINVAL when a request cannot be dequeued.
I do not dare to break this tradition.

BTW we tested my last proposal:
https://patchwork.kernel.org/patch/10796161/

... I see that we can simplify this patch when we just insert the
stop_out_nacking(ep) between line 909 and 910:

https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L909

and remove the stop_out_naking() call in start_queue()...

I will add this improvement to the new patch sequence.

Guido







[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux