Re: [PATCH 2/3] udc: net2280: Fix overrun of OUT messages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Zitat von Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>:

On Tue, 5 Feb 2019 guido@xxxxxxxxxxxxxxxxxx wrote:


Zitat von Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>:

> On Mon, 4 Feb 2019, Guido Kiener wrote:
>
>> From: Guido Kiener <guido.kiener@xxxxxxxxxxxxxxxxx>
>>
>> The OUT endpoint normally blocks (NAK) subsequent packets when a
>> short packet is received and returns an incomplete queue entry to
>> the gadget driver. Thereby the gadget driver can detect
>> a short packet when reading queue entries with a length that is
>> not equal to a multiple of packet size.
>>
>> The start_queue() function enables receiving OUT packets regardless
>> of the content of the OUT FIFO. This results in a problem:
>> When receiving is enabled more OUT packets are appended to the last
>> short packet and the gadget driver cannot determine the end of a
>> short packet anymore. Furthermore the remaining data in the OUT FIFO
>> is not delivered to the gadget driver immediately and can produce
>> timeout errors.
>
> How can that happen?  When the short packet is received, the endpoint
> immediately starts sending NAKs, so no more data can enter the FIFO.
> Furthermore, the incomplete transfer is returned to the gadget driver,
> which removes the short packet from the FIFO.  So the FIFO has to be
> empty when start_queue() is called.

I don't remember the exact scenario. It happens something like this:
When you have only one request (e.g. 4kB) left in your endpoint OUT
queue and a host sends 4kB + some extra bytes, then your request (4kB)
is delivered to the gadget driver while the extra bytes (short packet)
are still in the OUT FIFO.
I try to reconstruct the situation (without fix) how the driver can call
(re)start_dma. I do not see a reasonable path in the moment.

Okay, I see.  If there's no outstanding request when the short packet
arrives, the data will remain in the FIFO.

But won't this situation be detected in start_dma(), which is the only
place where start_queue() is called?  start_dma() has code to test
specifically for a previous short OUT packet.  Should the fix go there
instead?

Alan,

I have inserted a bug detection and can now explain the race:

Ween receiving a long transfer the gadget driver queues a new request with
net2280_queue() and calls start_dma(). In the meantime the OUT FIFO is still
receiving data and the short packet is not detected yet. So the start_dma()
function does not use the shortcut in line:

https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L905

Thus start_queue() is called at the end of the function which calls
stop_out_naking(). This will allow receiving new data, even when a short
packet is arrived in the meantime.
Indeed, I'm not sure now whether my patch only minimizes the race or
whether we need to rethink the logic here and call stop_out_naking() when
the request is really done and delivered to the gadget driver.
See function done().

Here is my code sample how I can detect the race (USB 2.0 with max packet
size of 512):

static void start_queue(struct net2280_ep *ep, u32 dmactl, u32 td_dma)
{
	struct net2280_dma_regs	__iomem *dma = ep->dma;
	unsigned int tmp = BIT(VALID_BIT) | (ep->is_in << DMA_DIRECTION);

	if (!(ep->dev->quirks & PLX_2280))
		tmp |= BIT(END_OF_CHAIN);
	ep_vdbg(ep->dev, "start_queue\n");

	writel(tmp, &dma->dmacount);
	writel(readl(&dma->dmastat), &dma->dmastat);

	writel(td_dma, &dma->dmadesc);
	if (ep->dev->quirks & PLX_PCIE)
		dmactl |= BIT(DMA_REQUEST_OUTSTANDING);
	writel(dmactl, &dma->dmactl);

	/* erratum 0116 workaround part 3:  pci arbiter away from net2280 */
	(void) readl(&ep->dev->pci->pcimstctl);

	// Show bug begin
	unsigned int avail = readl(&ep->regs->ep_avail);
	if (!ep->is_in && (avail % 512) != 0) {
		u32 t1, t2;
		t1 = readl(&ep->cfg->ep_cfg);
		t2 = readl(&ep->regs->ep_rsp) & 0xff;

		ep_err(ep->dev, " stat %08x avail %04x (ep%d%s-%s)%s\n",
			readl(&ep->regs->ep_stat),
			readl(&ep->regs->ep_avail),
			t1 & 0x0f, DIR_STRING(t1),
			type_string(t1 >> 8),
			ep->stopped ? "*" : "");
		WARN_ON(1);
	}
	// Show bug end

	writel(BIT(DMA_START), &dma->dmastat);

	//if (!ep->is_in && readl(&ep->regs->ep_avail) == 0) //<- patch
	if (!ep->is_in)
		stop_out_naking(ep);
}

This is the corresponding callstack.
Flag "NAK packets" (BIT4) is set in EP_STAT and there are now 944 (0x3b0)
bytes in the OUT FIFO.

net2280 0000:05:05.0:  stat 0001103a avail 03b0 (ep1out-bulk)
------------[ cut here ]------------
WARNING: CPU: 0 PID: 32093 at net2280.c:915 start_dma+0x1b2/0x2c0 [net2280]()
Call Trace:
 dump_stack+0x63/0x90
 warn_slowpath_common+0x82/0xc0
 warn_slowpath_null+0x1a/0x20
 start_dma+0x1b2/0x2c0 [net2280]
 net2280_queue+0x247/0x390 [net2280]
 _enqueueBulkoutBuf+0xa8/0x1f0 [rsusbtmc]
 dev_read+0x33e/0x490 [rsusbtmc]
 ? security_file_permission+0xa0/0xc0
 __vfs_read+0x18/0x40
 vfs_read+0x86/0x130
 SyS_read+0x55/0xc0
 ? SyS_ioctl+0x63/0x90

What do you think?

Regards,

Guido





[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux