On Thu, Dec 18, 2014 at 12:25:41PM +0530, Sanchayan Maity wrote: > On 12/18/2014 10:24 AM, Peter Chen wrote: > > On Wed, Dec 17, 2014 at 10:04:35AM +0530, Sanchayan Maity wrote: > >> On 12/17/2014 05:46 AM, Peter Chen wrote: > >>> On Tue, Dec 16, 2014 at 04:15:08PM +0530, Sanchayan Maity wrote: > >>>> On 12/16/2014 02:15 PM, Peter Chen wrote: > >>>>> On Tue, Dec 16, 2014 at 10:50:59AM +0530, Sanchayan Maity wrote: > >>>>>> On 12/16/2014 06:16 AM, Peter Chen wrote: > >>>>>>> On Mon, Dec 15, 2014 at 02:59:31PM +0530, Sanchayan Maity wrote: > >>>>>>>> Hello, > >>>>>>>> > >>>>>>>> On 12/15/2014 07:42 AM, Peter Chen wrote: > >>>>>>>>> On Fri, Dec 12, 2014 at 06:55:36PM +0530, Sanchayan Maity wrote: > >>>>>>>>>> Hello, > >>>>>>>>>> > >>>>>>>>>> On 12/12/2014 07:21 AM, Peter Chen wrote: > >>>>>>>>>>> On Thu, Dec 11, 2014 at 08:34:45AM -0600, Felipe Balbi wrote: > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> On Thu, Dec 11, 2014 at 04:08:43PM +0530, Sanchayan Maity wrote: > >>>>>>>>>>>>> Hello, > >>>>>>>>>>>>> > >>>>>>>>>>>>> I am working on a Freescale Cortex-A5 Vybrid Processor. The chip core > >>>>>>>>>>>>> is clocked at 500MHz and the USB IP core for this is by Chip-idea. I > >>>>>>>>>>>>> am running a 3.18-rc5 kernel on it and trying to use the USB gadget > >>>>>>>>>>>>> functionality. To be more specific the CDC ECM class. Currently, I > >>>>>>>>>>>>> cannot use this properly. If I use just "ping" to check, it works > >>>>>>>>>>>>> fine, but, after running iperf, even one transaction doesn't complete > >>>>>>>>>>>>> or completes rarely. Checking the CDC Ether interface with Wireshark > >>>>>>>>>>>>> shows, TCP Dup Ack messages and checking the USB bus with Wireshark, > >>>>>>>>>>>>> shows packets with USB Protocol Error -71 at one point and after that > >>>>>>>>>>>>> packets with USB connection Reset -104 error. If it's of any > >>>>>>>>>>>>> significance, I have Arch Linux with the 3.18 kernel running on my > >>>>>>>>>>>>> laptop with which the Vybrid connects. On the host side, the only > >>>>>>>>>>>>> error dmesg shows is "kevent 12 may have been dropped". I guess this > >>>>>>>>>>>>> is connected to the "TCP Previous Segment not captured" and "TCP Dup > >>>>>>>>>>>>> ACK" messages. > >>>>>>>>>>>>> > >>>>>>>>>>>>> My script for the gadget configuration is as below: > >>>>>>>>>>>>> > >>>>>>>>>>>>> /bin/mount none /mnt -t configfs > >>>>>>>>>>>>> /bin/mkdir /mnt/usb_gadget/g1 > >>>>>>>>>>>>> cd /mnt/usb_gadget/g1 > >>>>>>>>>>>>> /bin/mkdir configs/c.1 > >>>>>>>>>>>>> /bin/mkdir functions/ecm.0 > >>>>>>>>>>>>> /bin/mkdir strings/0x409 > >>>>>>>>>>>>> /bin/mkdir configs/c.1/strings/0x409 > >>>>>>>>>>>>> echo 0xa4a2 > idProduct > >>>>>>>>>>>>> echo 0x0525 > idVendor > >>>>>>>>>>>>> echo Freescale123 > strings/0x409/serialnumber > >>>>>>>>>>>>> echo Freescale > strings/0x409/manufacturer > >>>>>>>>>>>>> echo "USB Serial Gadget" > strings/0x409/product > >>>>>>>>>>>>> echo "Conf 1" > configs/c.1/strings/0x409/configuration > >>>>>>>>>>>>> echo 200 > configs/c.1/MaxPower > >>>>>>>>>>>>> ln -s functions/ecm.0 configs/c.1 > >>>>>>>>>>>>> echo ci_hdrc.0 > UDC > >>>>>>>>>>>>> /sbin/ifconfig usb0 up > >>>>>>>>>>>>> /sbin/ifconfig usb0 192.168.1.10 > >>>>>>>>>>>>> > >>>>>>>>>>>>> I have debug prints in the udc.c and u_ether.c using pr_debug and > >>>>>>>>>>>> > >>>>>>>>>>>> just a little hint, use any of the dev_*() macros next time, they'll > >>>>>>>>>>>> print the device name which helps figuring out which UDC you're using. > >>>>>>>>>>>> > >>>>>>>>>>>> Based on ci_hdrc.0 above, I suppose it's chipidea and Peter Chen > >>>>>>>>>>>> maintains that one, it really helps adding maintainers to Cc list. > >>>>>>>>>>>> > >>>>>>>>>>>>> enable them when required using dynamic debug. Without running iperf, > >>>>>>>>>>>>> using ping gives me a sequence of prints as below: > >>>>>>>>>>>>> > >>>>>>>>>>>>> [ 277.434409] In eth_start_xmit > >>>>>>>>>>>>> [ 277.434517] In UDC irq > >>>>>>>>>>>>> [ 277.434553] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 277.434567] In tx_complete > >>>>>>>>>>>>> [ 277.435443] In UDC irq > >>>>>>>>>>>>> [ 277.435477] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 277.435491] In rx_complete > >>>>>>>>>>>>> [ 277.435517] In rx_submit > >>>>>>>>>>>>> [ 277.435601] In eth_start_xmit > >>>>>>>>>>>>> [ 277.436441] In UDC irq > >>>>>>>>>>>>> [ 277.436465] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 277.436478] In rx_complete > >>>>>>>>>>>>> [ 277.436493] In rx_submit > >>>>>>>>>>>>> [ 277.436520] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 277.436533] In tx_complete > >>>>>>>>>>>>> [ 278.434865] In eth_start_xmit > >>>>>>>>>>>>> [ 278.434959] In UDC irq > >>>>>>>>>>>>> [ 278.434993] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 278.435006] In tx_complete > >>>>>>>>>>>>> [ 278.435881] In UDC irq > >>>>>>>>>>>>> [ 278.435910] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 278.435923] In rx_complete > >>>>>>>>>>>>> [ 278.435946] In rx_submit > >>>>>>>>>>>>> > >>>>>>>>>>>>> After running iperf without debug prints and then enabling before > >>>>>>>>>>>>> using ping gives me a sequence of prints as below > >>>>>>>>>>>>> [ 81.989827] In UDC irq > >>>>>>>>>>>>> [ 81.989871] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 81.989886] In rx_complete > >>>>>>>>>>>>> [ 81.989905] In rx_submit > >>>>>>>>>>>>> [ 82.989892] In UDC irq > >>>>>>>>>>>>> [ 82.989951] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 82.989967] In rx_complete > >>>>>>>>>>>>> [ 82.989992] In rx_submit > >>>>>>>>>>>>> [ 83.990064] In UDC irq > >>>>>>>>>>>>> [ 83.990126] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 83.990142] In rx_complete > >>>>>>>>>>>>> [ 83.990167] In rx_submit > >>>>>>>>>>>>> [ 84.990007] In UDC irq > >>>>>>>>>>>>> [ 84.990049] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 84.990064] In rx_complete > >>>>>>>>>>>>> [ 84.990083] In rx_submit > >>>>>>>>>>>>> [ 85.990085] In UDC irq > >>>>>>>>>>>>> [ 85.990147] In usb_gadget_giveback_request > >>>>>>>>>>>>> [ 85.990163] In rx_complete > >>>>>>>>>>>>> [ 85.990188] In rx_submit > >>>>>>>>>>>>> > >>>>>>>>>>>>> If I force a full speed configuration for this USB client port, I get > >>>>>>>>>>>>> a slightly more reliable operation where iperf can run for may be half > >>>>>>>>>>>>> an hour or so or almost an hour before it falls through. Putting in a > >>>>>>>>>>>>> delay of 100-150 microseconds in eth_start_xmit also improves it like > >>>>>>>>>>>>> full speed, but, still not reliable. If I run iperf with debug prints > >>>>>>>>>>>>> enable, this gives similar results to full speed config. After the > >>>>>>>>>>>>> failure of iperf test, even ping doesn't work. Bringing down this usb0 > >>>>>>>>>>>>> interface and then up again makes ping work again. I do realize that > >>>>>>>>>>>>> putting debug prints or delays like this is not the right thing to do, > >>>>>>>>>>>>> especially in ISR, but, just trying to debug. This is my first time > >>>>>>>>>>>>> digging in the USB stack. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Based on the above, it seems there might a subtle bug or race > >>>>>>>>>>>>> condition somewhere in the execution call chain which I have not been > >>>>>>>>>>>>> able to trace yet. Can someone give me some pointers on how I can dig > >>>>>>>>>>>>> and debug further?. > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> I just tried latest usb-next with i.mx6 platform, it works ok with > >>>>>>>>>>> 10 mins iperf bi-direction test. > >>>>>>>>>> > >>>>>>>>>> We did think that it is probably an issue seen with Vybrids only. > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> - Check Vybrid errata to see if any missing in code > >>>>>>>> > >>>>>>>> I had not checked the Vybrid errata. There are two erratas and I think one > >>>>>>>> of them might be relevant to the issue. > >>>>>>>> > >>>>>>>> e6857: Adding dTD to Primed Endpoint may not be recognized > >>>>>>>> > >>>>>>> > >>>>> > >>>>> Sorry, I made a mistake, it is a new errata, and does not be included in > >>>>> the code. All imx project uses 2.0a or 2.50a which does not need this > >>>>> errata, and Vybrid uses 2.40a core which needs this errata, I will do a > >>>>> patch for this soon, but before that, would you read your ID register > >>>>> ($BASE + 0x0) for me? I would like to confirm if your REVISION value > >>>>> is 0100b. > >>>>> > >>>> > >>>> As per the reference manual and also the devmem2 readout of the USB ID register > >>>> the value is 0xE481FA05. This gives a Revision number 0x81 for the controller > >>>> core and 0x05 for the ID viz. configuration number. > >>>> > >>> > >>> Thanks, Revision Number is bit 21 - bit 24, it is 0100b if your ID value > >>> is 0xE481FA05. > >>> > >> > >> I was wondering how you got the numbers exactly. The reference manual shows > >> bit 16 - bit 23 as the revision of the controller core, bit 0 - bit 5 as the > >> ID and bit 24 - bit 31 are reserved. Just for my own information is there a > >> different interpretation as well? > >> > >> Just to be clear so this confirms that the Vybrid has 2.40 core which needs the > >> errata and which is presently not implemented in software, which results in the > >> issue we are seeing? Otherwise I will just carry on my testing and also try and > >> see if I can implement the errata fix. > >> > >> Thanks for your inputs. > >> > >> -Regards, > >> Sanchayan. > >> > > > > Hi Sanchayan, > > > > Please try below patch to see if it can fix your problem: > > > > From 602a4db2b18f4451dda5b9365f127eae88ff68fb Mon Sep 17 00:00:00 2001 > > From: Peter Chen <peter.chen@xxxxxxxxxxxxx> > > Date: Thu, 18 Dec 2014 12:47:36 +0800 > > Subject: [PATCH 1/1] for 2.40a errata > > > > Signed-off-by: Peter Chen <peter.chen@xxxxxxxxxxxxx> > > --- > > drivers/usb/chipidea/udc.c | 28 +++++++++++++++++++++++++++- > > 1 file changed, 27 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c > > index c0754be..62b4625 100644 > > --- a/drivers/usb/chipidea/udc.c > > +++ b/drivers/usb/chipidea/udc.c > > @@ -201,7 +201,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl) > > if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num))) > > return -EAGAIN; > > > > - /* status shoult be tested according with manual but it doesn't work */ > > + /* status should be tested according with manual but it doesn't work */ > > return 0; > > } > > > > @@ -522,6 +522,18 @@ static void free_pending_td(struct ci_hw_ep *hwep) > > kfree(pending); > > } > > > > +static int reprime_dtd(struct ci_hdrc *ci, struct ci_hw_ep *hwep, struct td_node *node) > > +{ > > + hwep->qh.ptr->td.next = cpu_to_le32(node->ptr->next); > > + hwep->qh.ptr->td.token &= > > + cpu_to_le32(~(TD_STATUS_HALTED|TD_STATUS_ACTIVE)); > > + > > + wmb(); /* synchronize before ep prime */ > > + > > + return hw_ep_prime(ci, hwep->num, hwep->dir, > > + hwep->type == USB_ENDPOINT_XFER_CONTROL); > > +} > > + > > /** > > * _hardware_dequeue: handles a request at hardware level > > * @gadget: gadget > > @@ -535,6 +547,7 @@ static int _hardware_dequeue(struct ci_hw_ep *hwep, struct ci_hw_req *hwreq) > > struct td_node *node, *tmpnode; > > unsigned remaining_length; > > unsigned actual = hwreq->req.length; > > + struct ci_hdrc *ci = hwep->ci; > > > > if (hwreq->req.status != -EALREADY) > > return -EINVAL; > > @@ -582,6 +595,19 @@ static int _hardware_dequeue(struct ci_hw_ep *hwep, struct ci_hw_req *hwreq) > > list_del_init(&node->td); > > } > > > > +/* if (ci->rev == CI_REVISION_24) */ > > + while ((hwep->qh.ptr->curr == node->dma)) { > > + if (node->ptr->next != TD_TERMINATE) { > > + int n = hw_ep_bit(hwep->num, hwep->dir); > > + > > + /* Only do re-prime when both ENDPTPRIME bit and ENDPTSTAT bit are 0 */ > > + if (!hw_read(ci, OP_ENDPTPRIME, BIT(n)) && > > + !hw_read(ci, OP_ENDPTSTAT, BIT(n))) > > + reprime_dtd(ci, hwep, node); > > + } > > + udelay(1); > > + } > > + > > usb_gadget_unmap_request(&hwep->ci->gadget, &hwreq->req, hwep->dir); > > > > hwreq->req.actual += actual; > > > > Applied and tested the patch. Still the same condition. How about change condition like: > > + if (!hw_read(ci, OP_ENDPTSTAT, BIT(n))) > > + reprime_dtd(ci, hwep, node); If you still meet issue, you may help me indicate your "condition" detail. One clue, Marvell SoC may has this problem too, and they submit a patch below (In fact, I "steal" most of its idea). https://lkml.org/lkml/2014/2/25/43 -- Best Regards, Peter Chen -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html