Re: MUSB driver on AM3352 dropping USB packets

Hi,

On Fri, May 06, 2016 at 10:41:46AM +0000, Andrew Goodbody wrote:
> > From: Bin Liu [mailto:b-liu@xxxxxx]
> > On Thu, May 05, 2016 at 04:02:55PM +0000, Andrew Goodbody wrote:
> > > > From: Bin Liu [mailto:b-liu@xxxxxx]
> > > > On Thu, May 05, 2016 at 03:12:00PM +0000, Andrew Goodbody wrote:
> > > > > > From: Bin Liu [mailto:b-liu@xxxxxx] Hi,
> > > > > >
> > > > > > On Thu, May 05, 2016 at 12:22:33PM +0000, Andrew Goodbody wrote:
> > > > > > > > From: Bin Liu [mailto:b-liu@xxxxxx] Hi,
> > > > > > > >
> > > > > > > > On Wed, May 04, 2016 at 03:48:50PM +0000, Andrew Goodbody wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I have been investigating communication issues with iPads.
> > > > > > > > > When the system is busy it seems that the musb driver is
> > > > > > > > > silently dropping occasional packets. I have a usbmon
> > > > > > > > > trace that does not show the packet and I have a trace
> > > > > > > > > from a hardware USB analyser that does show the packet. So
> > > > > > > > > the device is correctly sending the packet, it is even
> > > > > > > > > being ACKed, but it is not passed up to the application.
> > > > > > > > > The packet is a bulk transfer packet of 20 bytes. Can
> > > > > > > > > anyone please give me pointers to where to go looking for
> > > > > > > > > the problem? The syslog shows nothing relevant.
> > > > > > > >
> > > > > > > > What is the part number on the am3352 chip?
> > > > > > >
> > > > > > > AM3352BZCZ100
> > > > > > >
> > > > > > > > What kernel version do you use?
> > > > > > >
> > > > > > > 4.5.0
> > > > > > >
> > > > > > > > Is musb cppi dma enabled? If so, does the problem still
> > > > > > > > happen when CPPI disabled?
> > > > > > >
> > > > > > > Yes. Yes. When testing with PIO I did get the message "Rx
> > > > > > > interrupt with no errors or packet!".
> > > > > > >
> > > > > > > > First try to turn on dynamic debug log in musb_host.c to
> > > > > > > > check if musb receives the packet or not.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > -Bin.
> > > > > > >
> > > > > > > I am having problems doing this. If I enable the whole file
> > > > > > > then I get lots of messages on the console about /dev/kmsg
> > > > > > > buffer overrun. There are more than 26 million packets in the
> > > > > > > hardware trace and I have not worked out how to correlate any
> > > > > > > of the possible messages from dynamic debug with those packets
> > > > > > > even when I enable some of the dynamic debug lines.  I can see
> > > > > > > a few messages about "DMA complete but packet still in FIFO,
> > > > > > > CSR 2103" and just the occasional "extra TX2 ready, csr 2100"
> > > > > > > when I enable some of the lines for dynamic debug.
> > > > > >
> > > > > > Well, this issue would not be easy to debug. Is this with your
> > > > > > custom board?
> > > > > > If so, have you run EyeDiagram test to rule out signal integrity
> > > > > > problem? Are you able to reproduce it with any TI EVM, such as
> > > > > > Beaglebone Black? If so, please explain the detail of the test
> > > > > > case, I could try to reproduce it on my side.
> > > > >
> > > > > Yes this is on a custom board and yes we did EyeDiagram tests.
> > > > > Also the ACK from the controller is seen, so that should rule out
> > > > > signal integrity issues.  I have just reproduced this on the
> > > > > Beaglebone Black using the latest TI SDK. Do you have access to 16
> > > > > iPads with lightning connectors and do you have a Mac running 10.10.x?
> > > >
> > > > No, I don't have those :(
> > > >
> > > > 16 devices connecting to musb sounds like too many. What is the ep info
> > > > in the descriptor of the iPad device?
> > >
> > > T:  Bus=02 Lev=04 Prnt=06 Port=06 Cnt=07 Dev#= 19 Spd=480  MxCh= 0
> > > D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  4
> > > P:  Vendor=05ac ProdID=12ab Rev= 3.40
> > > S:  Manufacturer=Apple Inc.
> > > S:  Product=iPad
> > > S:  SerialNumber=1da5f4610eafb36fa1e9eead80a56cb2db11dfce
> > > C:  #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=500mA
> > > I:  If#= 0 Alt= 0 #EPs= 3 Cls=06(still) Sub=01 Prot=01 Driver=
> > > E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=1250us
> > > E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=64ms
> > > C:  #Ifs= 3 Cfg#= 2 Atr=c0 MxPwr=500mA
> > > I:  If#= 0 Alt= 0 #EPs= 0 Cls=01(audio) Sub=01 Prot=00 Driver=
> > > I:  If#= 1 Alt= 0 #EPs= 0 Cls=01(audio) Sub=02 Prot=00 Driver=
> > > I:  If#= 1 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=00 Driver=
> > > E:  Ad=81(I) Atr=01(Isoc) MxPS= 192 Ivl=1ms
> > > I:  If#= 2 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=
> > > E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=125us
> > > C:* #Ifs= 2 Cfg#= 3 Atr=c0 MxPwr=500mA
> > > I:* If#= 0 Alt= 0 #EPs= 3 Cls=06(still) Sub=01 Prot=01 Driver=usbfs
> > > E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=1250us
> > > E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=64ms
> > > I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=fe Prot=02 Driver=usbfs
> > > E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > C:  #Ifs= 3 Cfg#= 4 Atr=c0 MxPwr=500mA
> > > I:  If#= 0 Alt= 0 #EPs= 3 Cls=06(still) Sub=01 Prot=01 Driver=
> > > E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=1250us
> > > E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=64ms
> > > I:  If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=fe Prot=02 Driver=
> > > E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > I:  If#= 2 Alt= 0 #EPs= 0 Cls=ff(vend.) Sub=fd Prot=01 Driver=
> > > I:  If#= 2 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=fd Prot=01 Driver=
> > > E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > I:  If#= 2 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=fd Prot=01 Driver=
> > > E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > > E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> > >
> > > Config #4 is the relevant one. Interface #0 quickly uses up the
> > > controller endpoints and then starts to return -ENOSPC thereafter for
> > > the interrupt URBs from later iPads, but that does not matter for my
> > > use case. Interface #1 is the one that is actually used, just a bulk
> > > in and a bulk out endpoint. So some URBs get opened on controller
> > > endpoints 2-15, but most are just queued up on controller endpoint 1.
> > > Yes this will keep the controller busy.
> > 
> > Well, I am not sure musb is able to support your use case. As you just said,
> > the required endpoints are more than musb provides, so some urbs are
> > queued to ep1, which is not ideal, but the driver is just designed that way.
> 
> What are you not sure about? Why do you think that there could be an issue?

I haven't had a use case to investigate this scenario, but consider the
case where multiple urbs to different devices are queued on a single
musb hw ep. If one urb's completion is delayed for whatever reason, the
other queued urbs will not be executed, which would cause timeouts in
the driver for the other devices. I don't know whether it would cause
other problems, though, since I don't have a setup to look at this
scenario.
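
To illustrate the concern, here is a toy model in Python. It is not the
musb driver and does not reflect how the driver actually schedules urbs;
the names Urb, completion_times and timed_out, the service times, and the
1000 ms timeout are all made up for the example. It only shows that when
urbs for different devices are served strictly in order on one shared
endpoint, a single slow urb at the head of the queue pushes every urb
behind it past the timeout.

```python
from dataclasses import dataclass


@dataclass
class Urb:
    device: str       # hypothetical device label
    service_ms: int   # assumed time this urb occupies the shared endpoint


def completion_times(queue):
    """Return (device, completion time in ms) for urbs served in FIFO order."""
    now = 0
    done = []
    for urb in queue:
        now += urb.service_ms
        done.append((urb.device, now))
    return done


def timed_out(queue, timeout_ms):
    """Return the devices whose urb completes later than timeout_ms."""
    return [dev for dev, t in completion_times(queue) if t > timeout_ms]


# All urbs complete quickly: nobody times out.
healthy = [Urb("ipad1", 1), Urb("ipad2", 1), Urb("ipad3", 1)]
# The first urb is delayed: everything queued behind it is late too.
stalled = [Urb("ipad1", 5000), Urb("ipad2", 1), Urb("ipad3", 1)]

print(timed_out(healthy, 1000))
print(timed_out(stalled, 1000))
```

In the stalled case the urbs for ipad2 and ipad3 are reported as late even
though each needs only 1 ms of endpoint time, which is the effect described
above.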

> No, the queuing on ep1 is not ideal, but as you say the driver is
> designed that way and so it should either work or else return an
> error. Silently dropping URBs is a bug and is not acceptable behaviour
> in a driver.

We don't know yet what drops the packet; it could be the driver, or it
could be the hw.

> 
> > Honestly, I haven't checked what would happen if some bulk endpoint urbs
> > are queued to ep1.
> 
> !
> Well, on the usbmon trace I can see at least 60 active bulk rx URBs in
> use simultaneously for >10s, so the queueing is certainly working, at
> least for the huge majority of URBs. At the point when the dropped URB
> occurs there are only around 28 in use.
> 
> > > Although I said 16 iPads, this problem will happen with fewer, it just
> > > becomes rarer.
> > 
> > How many? Please keep in mind that you probably have 5 hubs connected,
> > which take 5 INT endpoints permanently.
> 
> I have not completely characterised this. As I said, it gets rarer
> with fewer and so the test times get longer and longer. There are in
> fact 4 hubs in use but there is also an FTDI USB-serial adapter.  The
> iPads themselves have four rx bulk URBs from if#1, and a bulk and an int
> rx URB from if#0, so each iPad can use up 6 controller endpoints, so
> the queueing code is being put into use very early with as few as 2
> or 3 iPads connected. I have not seen problems with this few iPads,
> but that may be just because I have not been able to test for long
> enough. It all rather depends on whether there is a corner case not
> handled properly that I am triggering reliably just due to the sheer
> volume of USB traffic or else it may be caused by something that is
> CPU load dependent. If this is just a rare corner case then it could
> affect anyone at some point or other.
> 
> The pattern that I see is always the same. The application starts
> talking to each iPad one at a time, with about 0.5s between each
> start. At the start 4 rx URBs are requested on device endpoint #5,
> then a tx URB is sent to device endpoint 4. The iPad replies to this
> tx URB on one of the rx URBs. I see this rx URB on the wire and the
> ACK from the controller. In the working case there is then lots of
> data transferred over those 4 rx URBs with the occasional prompt from
> a tx URB. In the failing case there is no more traffic on that
> interface as the application never sees the initial reply. I have no
> control over this application; its behaviour cannot be changed.
> 

As I said, this is not trivial to debug. I will see what I can do with my
limited spare time.

Regards,
-Bin.

> Andrew
> 
> > > Andrew
> > >
> > > > > > >
> > > > > > > Andrew
> > > > > >
> > > > > > Regards,
> > > > > > -Bin.
> > 
> > Regards,
> > -Bin.
> 