Re: USB lockup on OMAP3530

On Wednesday 27 January 2010 00:13:48 Robert Nelson wrote:
> On Tue, Jan 26, 2010 at 4:30 PM, Andreas Hartmetz <ahartmetz@xxxxxxxxx> wrote:
> > Hi,
> > 
> > I have a problem with the USB OTG port on my Beagle Board revision B6 -
> > the OTG port is the only USB port on that device so it's critical that
> > it works. Everybody on this list probably knows this hardware, I'll just
> > say that it has an ARM Cortex-A8 CPU (OMAP3530) with a built-in USB
> > interface. The driver is musb_hdrc.
> > I'm using this interface in host mode (no gadget mode support compiled
> > in, though this doesn't seem to make a difference) to attach necessary
> > peripherals via a powered USB hub. The make and model of hub do not play
> > a role, I've tried several.
> > 
> > The problem:
> > Whenever the following conditions are met, all hardware on the port
> > "disappears" after a few minutes, cutting off the board from network and
> > storage:
> > - Network traffic over USB (doesn't matter if it's regular 100
> >   Mbps Ethernet or a WLAN stick)
> > - Disk "traffic" over USB - I use a 2.5" disk in a USB enclosure
> > - CPU load - not sure if this is necessary
> > The only fix seems to be a reboot.
> > 
> > This happens to me when compiling something on the board (easier than
> > cross-compiling) with the help of Icecream, which is a kind of distcc on
> > steroids. On #beagle on Freenode IRC I've found somebody with the same
> > problem when scp-ing a large file (several hundred megabytes) off or
> > onto the USB disk. I can reproduce that as well.
> > I have spent a lot of time trying to find fixes and/or workarounds but
> > nothing worked so far except using the "validation kernel" recommended
> > to test the hardware e.g. after receipt:
> > http://code.google.com/p/beagleboard/wiki/BeagleBoardDiagnostics.
> > Fittingly, the link to the kernel image is currently broken. The board
> > usually locked itself up after about a day when using that ancient
> > kernel, so it's not an option either, and probably quite unmaintained
> > and buggy in many other areas.
> > 
> > So far I have tried many versions of the linux-omap and linux-omap-pm
> > kernel, from about 2.6.30 to the latest git version. They all exhibit
> > the USB OTG death bug.
> > I've used kernels with openembedded patches and without, currently
> > without. Yesterday I discovered the musb_hdrc.fifo_mode parameter and
> > played around with it. I also modified the given configurations. Result:
> > - FIFO configurations including .mode = BUF_DOUBLE don't work at all - no
> >  devices work.
> > - the USB death bug is not fixed by:
> >  - using only one endpoint
> >  - using no TXRX entries but only separate RX and TX
> >    (every endpoint gets a TX and an RX entry though)
> >  - using a large number of endpoints with same maxpacket value
> > ... still no solution.
> > 
> > Enabling debug output of the musb_hdrc driver (yes I've also compiled in
> > debug messages) is not very practical due to the high volume of
> > messages; also, when the bug occurs nothing special is printed. The
> > first error usually comes from the memory manager / filesystem
> > complaining that it can't do "IO to offline device", i.e. the
> > disappeared external harddisk (which contains the swapfile).
> > 
> > I would *really* appreciate somebody looking into this because this
> > currently makes the hardware as useful as a brick for me. I can supply
> > debug output and test patches.
> > 
> > Cheers,
> > Andreas
> 
> Hi Andreas,
> 
> Give this patch a try:
> http://bazaar.launchpad.net/~beagleboard-kernel/%2Bjunk/2.6-stable/annotate/head%3A/patches/musb/fifo-change.patch
> 
> It's in Angstrom's 2.6.29, my 2.6.31/32... And a real fix just went
> into mainline 2.6.33-rc5, with it my otg port is solid as a rock and
> can transfer large amounts of data between shared devices on the
> otg/musb port...
> 
> http://rcn-ee.homeip.net:81/dl/farm/log/2.6.32.6-x6.0_beagle-128mb-0_musb-stress-test.txt
> 
> Regards,

Nice try, but no success.

I've tried the fifo-change patch against the pm branch (with
musb_hdrc.fifo_mode=4, obviously; otherwise it wouldn't do anything) and also
your 2.6.32.6-x6.0 kernel. Neither fixes the lockup for me. With your kernel I
have an additional problem that I don't have with pm: it takes several tries
to bring the USB WLAN interface (rt73usb driver) up to a working state.
Basically I start and kill wpa_supplicant, unload the module a few times and
whatnot until it works. DHCP seems to work almost every time, but no IP or
ICMP packets come through. I know it sounds strange.
When it works, ping response times range from one to ten seconds (?!), whereas
they are in the range of several milliseconds with pm.
I tried your latest kernel with and without musb_hdrc.fifo_mode=4 to be sure.
Which fix in mainline do you mean, btw?
Unless a particular commit can be identified as fixing the bug, I consider it
fixed by accident, and it may break again by accident.
I suspect that your stress test is not stressful enough. Try scp or anything
else that loads the CPU heavily while network and disk I/O are going on.
Maybe throw in some swapping, with swap on the external disk, for good
measure; git gc seems to work well for that.
If all else fails I can publish a filesystem image with a distributed build 
ready to run.
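
To make the load pattern above concrete, here is a minimal Python sketch of
that kind of combined stress: one thread burning CPU, one writing to the USB
disk with fsync, and optionally one streaming data to a TCP sink over the USB
network interface. It is only an illustration, not the distributed build
mentioned above and not a verified reproducer of the lockup. The function
names, the file path and the sink address are all hypothetical.

```python
import hashlib
import os
import socket
import threading
import time


def cpu_burn(stop, counters):
    """Keep the CPU busy by hashing a buffer over and over."""
    buf = os.urandom(64 * 1024)
    while not stop.is_set():
        hashlib.sha256(buf).digest()
        counters["cpu"] += 1


def disk_churn(path, stop, counters, chunk=1 << 20):
    """Rewrite a file and fsync it so the writes actually reach the disk."""
    data = os.urandom(chunk)
    while not stop.is_set():
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        counters["disk"] += 1


def net_churn(addr, stop, counters, chunk=64 * 1024):
    """Stream random data to a TCP sink listening at addr = (host, port)."""
    data = os.urandom(chunk)
    with socket.create_connection(addr, timeout=10) as s:
        while not stop.is_set():
            s.sendall(data)
            counters["net"] += 1


def run_stress(disk_path, net_addr=None, seconds=60.0):
    """Run all loads concurrently for `seconds`; return iteration counts."""
    stop = threading.Event()
    counters = {"cpu": 0, "disk": 0, "net": 0}
    threads = [
        threading.Thread(target=cpu_burn, args=(stop, counters)),
        threading.Thread(target=disk_churn, args=(disk_path, stop, counters)),
    ]
    if net_addr is not None:  # network load is optional
        threads.append(
            threading.Thread(target=net_churn, args=(net_addr, stop, counters))
        )
    for t in threads:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in threads:
        t.join()
    return counters
```

Assuming the USB disk is mounted at /mnt/usbdisk and some other machine
accepts and discards TCP data on port 9000 (both made up for this example), a
ten-minute run would be
run_stress("/mnt/usbdisk/stress.bin", ("192.168.1.10", 9000), seconds=600).
If the devices drop off the bus, the disk or network thread should die with an
I/O error while the others keep running.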

Andreas