Hi, I am Thomas who contacted Laurent regarding this issue. We have some
further observations:
Am 04.05.2017 um 08:45 schrieb Peter Chen:
On Wed, May 03, 2017 at 01:32:28PM +0300, Laurent Pinchart wrote:
There was no one reported this problem before, but from the description,
it seems an IC issue which is triggered at high loading memory bus,
controller may not get time to visit memory at limited time.
That's my guess too. I was expecting the USB controller's bus master interface
to get stalled but eventually perform the access (or retry it, I'm not sure
what kind of bus it sits on), but there might be a hardware bug that messes up
the controller's state machine. I won't rule out the possibility of a software
issue yet, it might be possible to detect this condition and retry the
transfer.
I am not sure if it can be recovered, you can call ->fifo_flush, and
->ep_disable and ->ep_enable if it returns -ETIME, and re-submit this
request.
At Xilinx Zynq, its tx buffer is small, and less than 512 bytes
(84bc70f94d81, "usb: chipidea: add xilinx zynq platform data"), and your
throughput may be > (512 * 3) bytes/SoF, you can't use non-stream mode by
reducing max packet size.
My throughput is actually 1*1024 bytes / SOF.
>From previous discussion, the tx fifo size is 341.33 bytes for xilinx
zynq, you can set max packet size as 341 and mult as 3, then you can
transfer 1023 bytes / SoF for non-stream mode, assumed the non-stream
mode can fix your problem.
I am not sure if this is totally correct: Xilinx UG585, page 1815,
register HWTXBUF (TXADD), the TxBuffer size of the Xilinx controller is
only 768 Bytes. However, I think you are correct that the issue is
related to this topic:
When we reduce the max packet size to 512, we no longer get the full
lock-ups, but UVC streaming still stops after some time (under have
memory traffic).
When we reduce the max packet size to 256, things appear to work stable
(but very slow).
When we use a max packet size of 720 we still see lock-ups (which is a
bit of surprise to me, I would have expected something close to 768 to
be the magical limit).
diff --git a/drivers/usb/chipidea/ci_hdrc_usb2.c b/drivers/usb/chipidea/ci_hdrc_usb2.c
index d162cc0..7229f2d 100644
--- a/drivers/usb/chipidea/ci_hdrc_usb2.c
+++ b/drivers/usb/chipidea/ci_hdrc_usb2.c
@@ -33,6 +33,7 @@ static const struct ci_hdrc_platform_data ci_default_pdata = {
static struct ci_hdrc_platform_data ci_zynq_pdata = {
.capoffset = DEF_CAPOFFSET,
+ .flags = CI_HDRC_DISABLE_STREAMING,
};
I think you may observe many under-run at bus analyzer during the test.
I'll try to get this checked (as I don't have a USB analyzer here).
As a workaround, you may try to do below things:
1. Link more TDs before the UVC run
Do you mean I should have more requests queued to avoid hitting the
software/hardware race that the loop handles ?
Yes, you can comment out that code if you have more pending dTDs.
2. Comment out the code, you are stuck in, it is only useful for protect
last td status which is handling or will be handled soon by hardware.
/*
do {
hw_write(ci, OP_USBCMD, USBCMD_ATDTW, USBCMD_ATDTW);
tmp_stat = hw_read(ci, OP_ENDPTSTAT, BIT(n));
} while (!hw_read(ci, OP_USBCMD, USBCMD_ATDTW));
hw_write(ci, OP_USBCMD, USBCMD_ATDTW, 0);
if (tmp_stat)
goto done;
*/
I'm not familiar with this driver, but if I understand things correctly, the
code (and the "TD tripwire" hardware feature) is here to handle a race
condition where the hardware could look at the last TD's next pointer at the
same time it gets updated by the software. If I comment the code out, won't
the endpoint will be primed unconditionally, even if it's currently executing
transfers ? Won't that be a problem ?
Try below code, I have tested it using qmult=20 for NCM.
diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
index d68b125..dbef51a 100644
--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -485,13 +485,19 @@ static int _hardware_enqueue(struct ci_hw_ep *hwep, struct ci_hw_req *hwreq)
wmb();
if (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
goto done;
- do {
- hw_write(ci, OP_USBCMD, USBCMD_ATDTW, USBCMD_ATDTW);
- tmp_stat = hw_read(ci, OP_ENDPTSTAT, BIT(n));
- } while (!hw_read(ci, OP_USBCMD, USBCMD_ATDTW));
- hw_write(ci, OP_USBCMD, USBCMD_ATDTW, 0);
- if (tmp_stat)
- goto done;
+
+ if (ci->ep0out == hwep || ci->ep0in == hwep) {
+ do {
+ hw_write(ci, OP_USBCMD, USBCMD_ATDTW, USBCMD_ATDTW);
+ tmp_stat = hw_read(ci, OP_ENDPTSTAT, BIT(n));
+ } while (!hw_read(ci, OP_USBCMD, USBCMD_ATDTW));
+ hw_write(ci, OP_USBCMD, USBCMD_ATDTW, 0);
+ if (tmp_stat)
+ goto done;
+ } else {
+ if(hw_read(ci, OP_ENDPTSTAT, BIT(n)))
+ goto done;
+ }
}
/* QH configuration */
3. If it can let your test run more time, try change code like below:
if (remaining_td_num > 2)
don't do hardware check;
else
do hardware check.
I'll test that, but I assume I'm hitting the problem when the number of
remaining TDs is exactly one.
Let's not meet this situation, just queue before TDs before running.
Unfortunately we do not understand enough yet of the drivers (chipidea
and uvc) to test this, maybe Laurent did already.
Besides, I can try your test if you could show me the detail test steps.
I need to create an easy test case, at the moment it's a complex application
and not every part of it can be published. I'll try to provide an open-source
test case implementation.
I only have nxp/fsl board which imx6 series SoC is on it.
We had considered to make a test case based on the Zybo board. However,
we have big time pressure on the project and I think we will concentrate
on a bulk transfer implementation, as I have big doubts in the meantime
if it is possible at all to get realiable and fast isochronuos USB
transfer with Zynq in our situation (with a lot of memory traffic - and
we have only 16b external bus interface).
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html