Hi, On Fri, May 06, 2016 at 10:41:46AM +0000, Andrew Goodbody wrote: > > From: Bin Liu [mailto:b-liu@xxxxxx] > > On Thu, May 05, 2016 at 04:02:55PM +0000, Andrew Goodbody wrote: > > > > From: Bin Liu [mailto:b-liu@xxxxxx] > > > > On Thu, May 05, 2016 at 03:12:00PM +0000, Andrew Goodbody wrote: > > > > > > From: Bin Liu [mailto:b-liu@xxxxxx] Hi, > > > > > > > > > > > > On Thu, May 05, 2016 at 12:22:33PM +0000, Andrew Goodbody wrote: > > > > > > > > From: Bin Liu [mailto:b-liu@xxxxxx] Hi, > > > > > > > > > > > > > > > > On Wed, May 04, 2016 at 03:48:50PM +0000, Andrew Goodbody > > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > I have been investigating communication issues with iPads. > > > > > > > > > When the system is busy it seems that the musb driver is > > > > > > > > > silently dropping occasional packets. I have a usbmon > > > > > > > > > trace that does not show the packet and I have a trace > > > > > > > > > from a hardware USB analyser that does show the packet. So > > > > > > > > > the device is correctly sending the packet, it is even > > > > > > > > > being ACKed, but it is not passed up to the application. > > > > > > > > > The packet is a bulk transfer packet of 20 bytes. Can > > > > > > > > > anyone please give me pointers to where to go looking for > > > > > > > > > the problem? The syslog shows > > > > nothing relevant. > > > > > > > > > > > > > > > > What is the part number on the am3352 chip? > > > > > > > > > > > > > > AM3352BZCZ100 > > > > > > > > > > > > > > > What kernel version do you use? > > > > > > > > > > > > > > 4.5.0 > > > > > > > > > > > > > > > Is musb cppi dma enabled? If so, does the problem still > > > > > > > > happen when CPPI disabled? > > > > > > > > > > > > > > Yes. Yes. When testing with PIO I did get the message "Rx > > > > > > > interrupt with no > > > > > > errors or packet!". > > > > > > > > > > > > > > > First try to turn on dynamic debug log in musb_host.c to > > > > > > > > check if musb receives the packet or not. > > > > > > > > > > > > > > > > Regards, > > > > > > > > -Bin. > > > > > > > > > > > > > > I am having problems doing this. If I enable the whole file > > > > > > > then I get lots of messages on the console about /dev/kmsg > > > > > > > buffer overrun. There are more then 26 million packets in the > > > > > > > hardware trace and I have not worked out how to correlate any > > > > > > > of the possible message from dynamic debug with those packets > > > > > > > even when I enable some of the dynamic debug lines. I can see > > > > > > > a few messages about "DMA complete but packet still in FIFO, > > > > > > > CSR 2103" and just the > > > > occasional "extra TX2 ready, csr 2100" > > > > > > > when I enable some of the lines for dynamic debug. > > > > > > > > > > > > Well, this issue would not be easy to debug. Is this with your > > > > > > custom > > > > board? > > > > > > If so, have you run EyeDiagram test to rule out signal integrity > > > > > > problem? Are you able to reproduce it with any TI EVM, such as > > > > > > Beaglebone Black? If so, please explain the detail of the test > > > > > > case, I could try to reproduce it on my side. > > > > > > > > > > Yes this is on a custom board and yes we did EyeDiagram tests. > > > > > Also the ACK from the controller is seen, so that should rule out > > > > > signal integrity issues. I have just reproduced this on the > > > > > Beaglebone Black using the latest TI SDK. Do you have access to 16 > > > > > iPads with lightning connectors and do you have a Mac running 10.10.x? > > > > > > > > No, I don't have those :( > > > > > > > > 16 devices connecting to musb sounds too many. what is the ep info > > > > in the descriptor of the ipad device? > > > > > > T: Bus=02 Lev=04 Prnt=06 Port=06 Cnt=07 Dev#= 19 Spd=480 MxCh= 0 > > > D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 4 > > > P: Vendor=05ac ProdID=12ab Rev= 3.40 > > > S: Manufacturer=Apple Inc. > > > S: Product=iPad > > > S: SerialNumber=1da5f4610eafb36fa1e9eead80a56cb2db11dfce > > > C: #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=500mA > > > I: If#= 0 Alt= 0 #EPs= 3 Cls=06(still) Sub=01 Prot=01 Driver= > > > E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=1250us > > > E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=64ms > > > C: #Ifs= 3 Cfg#= 2 Atr=c0 MxPwr=500mA > > > I: If#= 0 Alt= 0 #EPs= 0 Cls=01(audio) Sub=01 Prot=00 Driver= > > > I: If#= 1 Alt= 0 #EPs= 0 Cls=01(audio) Sub=02 Prot=00 Driver= > > > I: If#= 1 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=00 Driver= > > > E: Ad=81(I) Atr=01(Isoc) MxPS= 192 Ivl=1ms > > > I: If#= 2 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver= > > > E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=125us > > > C:* #Ifs= 2 Cfg#= 3 Atr=c0 MxPwr=500mA > > > I:* If#= 0 Alt= 0 #EPs= 3 Cls=06(still) Sub=01 Prot=01 Driver=usbfs > > > E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=1250us > > > E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=64ms > > > I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=fe Prot=02 Driver=usbfs > > > E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > C: #Ifs= 3 Cfg#= 4 Atr=c0 MxPwr=500mA > > > I: If#= 0 Alt= 0 #EPs= 3 Cls=06(still) Sub=01 Prot=01 Driver= > > > E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=1250us > > > E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=64ms > > > I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=fe Prot=02 Driver= > > > E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > I: If#= 2 Alt= 0 #EPs= 0 Cls=ff(vend.) Sub=fd Prot=01 Driver= > > > I: If#= 2 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=fd Prot=01 Driver= > > > E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > I: If#= 2 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=fd Prot=01 Driver= > > > E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms > > > > > > Config #4 is the relevant one. Interface #0 quickly uses up the > > > controller endpoints and then starts to return -ENOSPC thereafter for > > > the interrupt URBs from later iPads, but that does not matter for my > > > use case. Interface #1 is the one that is actually used, just a bulk > > > in and a bulk out endpoint. So some URBs get opened on controller > > > endpoints 2-15, but most are just queued up on controller endpoint 1. > > > Yes this will keep the controller busy. > > > > Well, I am not sure musb is able to support your use case. As you just said, > > the required endpoints are more than musb provides, so some urbs are > > queued to ep1, which is not ideal, but the driver is just designed that way. > > What are you not sure about? Why do you think that there could be an issue? I didn't have a use case to investigate this scenario, but think about the case that multiple urbs to different devices are queued into a single musb hw ep, if for whatever reason one urb completion is delayed, other queued urbs will not be executed, it would cause timeout in the driver to other devices. I don't know if it would cause other problem though, since I don't have a setup to look at this scenario. > No, the queuing on ep1 is not ideal, but as you say the driver is > designed that way and so it should either work or else return an > error. Silently dropping URBs is a bug and is not acceptable behaviour > in a driver. We don't know who drops the packet yet, could be the driver, or could be the hw too. > > > Honestly, I haven't checked what would happen if some bulk endpoint urbs > > are queued to ep1. > > ! > Well on the USBMON trace I can see at least 60 active bulk rx URBs in > use simultaneously for >10s, so the queueing is certainly working, at > least for the huge majority of URBs. The point when the dropped URB > occurs there are only around 28 in use. > > > > Although I said 16 iPads, this problem will happen with fewer, it just > > > becomes rarer. > > > > How many? Please keep in mind that you probably have 5 hubs connected, > > which take 5 INT endpoints permanently. > > I have not completely characterised this. As I said, it gets rarer > with fewer and so the test times get longer and longer. There are in > fact 4 hubs in use but there is also an FTDI USB-serial adapter. The > iPads themselves have four rx bulk URBs from if#1, and a bulk and int > rx URBs from if#0, so each iPad can use up 6 controller endpoints, so > the queueing code is being put into use very early with as little as 2 > or 3 iPads connected. I have not seen problems with this few iPads, > but that may be just because I have not been able to test for long > enough. It all rather depends on whether there is a corner case not > handled properly that I am triggering reliably just due to the sheer > volume of USB traffic or else it may be caused by something that is > CPU load dependent. If this is just a rare corner case then it could > affect anyone at some point or other. > > The pattern that I see is always the same. Each iPad is started > talking to by the application, one at a time, about 0.5s between each > start. At the start 4 rx URBs are requested on device endpoint #5, > then a tx URB is sent to device endpoint 4. The iPad replies to this > tx URB on one of the rx URBs. I see this rx URB on the wire and the > ACK from the controller. In the working case there is then lots of > data transferred over those 4 rx URBs with the occasional prompt from > a tx URB. In the failing case there is no more traffic on that > interface as the application never sees the initial reply. I have no > control over this application, its behaviour cannot be changed. > As I said this is not trivial to debug. I will see what I can do with my limited spare time. Regards, -Bin. > Andrew > > > > Andrew > > > > > > > > > > > > > > > > > Andrew > > > > > > > > > > > > Regards, > > > > > > -Bin. > > > > Regards, > > -Bin. > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html