On Sat, 28 Feb 2009, Chris Frey wrote:

> On Thu, Feb 26, 2009 at 08:52:09PM -0500, Paul O'Keefe wrote:
> > I was generous and gave the device 100,000 micro seconds to settle. And
> > boom (using your word!) all was well. It runs btool correctly, it syncs
> > correctly and it's happy. And that's a solution that should be generic
> > across the universe of devices we have to play with. I'll send Chris a
> > patch based on his latest version and we can move forward from there.
>
> I was finally able to reproduce the original timeout bug using a VM
> containing Debian Lenny.
>
> There were two issues confusing the debugging process. One was that the
> timeout appeared to be 1ms in gdb, but this was just a stack anomaly...
> stepping a bit farther changed gdb's results... probably due to using
> -O2 for the build.
>
> The other issue was a timeout in libusb's usb_bulk_read() function.
> This depended on:
>
>     - a fairly modern tool chain (e.g. Debian Lenny or Ubuntu Jaunty)
>     - a 2.6.28 kernel or higher (2.6.27 never shows this issue)
>
> My setup is a Debian Etch system running 2.6.28.7, and a Debian Lenny
> system running in a QEMU VM. btool (compiled with g++ 4.1.2 on Etch)
> runs fine on 2.6.28.7 and 2.6.28.4 on the bare Etch system. But if I run
> btool in the VM Lenny system using 2.6.28.4, compiled with g++ 4.3.2,
> I get a timeout. Adding a delay doesn't seem to help.

You've got so many variables here, it's hard to keep things straight:
Distribution (Etch vs. Lenny), Kernel (2.6.28.4 vs. 2.6.28.7), Environment
(bare vs. VM), Compiler (4.1.2 vs. 4.3.2). Maybe other things as well.

The cardinal rule of debugging is to vary only one ingredient at a time.
If you can do a head-to-head comparison in which the only variation is in
userspace (compiler and/or distribution), then looking at the usbmon logs
can help. If you vary either the kernel or the environment, then
everything else must remain unchanged if you want a meaningful result.
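As a rough illustration of the one-variable-at-a-time rule, here is a
minimal Python sketch. The factor values come from the setup described
above; the function names, the dictionary layout, and the choice of
2.6.28.7 for the known-good baseline are assumptions made for this
example only. It first shows how many factors differ between the working
and failing setups, then lists the test configurations that change
exactly one factor relative to the working one.

```python
# Factors and values as named in the thread. The layout is illustrative.
FACTORS = {
    "distribution": ("Etch", "Lenny"),
    "kernel": ("2.6.28.7", "2.6.28.4"),
    "environment": ("bare", "VM"),
    "compiler": ("g++ 4.1.2", "g++ 4.3.2"),
}

# Working setup: btool built with g++ 4.1.2 on the bare Etch system.
# (The report says 2.6.28.4 also works there; 2.6.28.7 is picked
# arbitrarily as the baseline kernel.)
KNOWN_GOOD = {
    "distribution": "Etch",
    "kernel": "2.6.28.7",
    "environment": "bare",
    "compiler": "g++ 4.1.2",
}

# Failing setup: Lenny in the VM, 2.6.28.4, built with g++ 4.3.2.
FAILING = {
    "distribution": "Lenny",
    "kernel": "2.6.28.4",
    "environment": "VM",
    "compiler": "g++ 4.3.2",
}


def differing_factors(a, b):
    """Return the sorted names of the factors whose values differ."""
    return sorted(name for name in a if a[name] != b[name])


def single_factor_variants(baseline, factors):
    """Return (factor, config) pairs that differ from baseline in one factor."""
    variants = []
    for name, values in factors.items():
        for value in values:
            if value != baseline[name]:
                changed = dict(baseline)
                changed[name] = value
                variants.append((name, changed))
    return variants


if __name__ == "__main__":
    print("factors differing between good and bad setups:",
          differing_factors(KNOWN_GOOD, FAILING))
    for name, config in single_factor_variants(KNOWN_GOOD, FACTORS):
        print("vary only %s: %s" % (name, config))
```

Each configuration it lists is a candidate head-to-head test against the
working setup. A result is only meaningful for the factor that was
changed in that run.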
> All kernels I'm using for testing, on both systems, are compiled on the
> Etch system with gcc-3.3.
>
> I've been slowly running through a git bisect of the kernel to try to
> narrow this down. I have to be careful, since there was another USB
> bug that was fixed in 2.6.28.2 with commit:
>     73cb49b8860d9336ee4b24ecbc0d2358aff862f7
>
> So far it's down to about 36 commits:
>
>     good: eef70b217a7ab46e8e0cf75982ad75d8305d5591
>     bad:  0bc77ecbe4f69ff8ead1d2abfe84ca9ba2a7bca4
>
> The bad commit fails twice.
>
>     1) compiled as-is, it gives
>            error in usb_bulk_write(): -2 error submitting URB:
>            No such file or directory
>
>     2) when the 2.6.28.2 patch is applied, it gives the timeout
>        in usb_bulk_read()

Some of these problems certainly could be related to the toggle-value
error fixed by the commit you mention above. There was a second, similar
toggle-value error, not fixed by that commit; it might be tripping you up
as well. It was fixed by b7055fa7953a23512ea7d4f97cc5ac209e14a64a.

Alan Stern