On Tue, 23 May 2017, Marek Pikarski wrote: > Hi Alan, > First of all, many thanks for the reply! Please read my comments below... > > On 22.05.2017 21:46, Alan Stern wrote: > > On Mon, 22 May 2017, Marek Pikarski wrote: > > > >> Hi! > >> > >> I am currently hunting the source of an issue that we have with an USB > >> modem. There could even be some HW / board design issue. > >> > >> My idea was to try the kernel USB unit tests and this seems to work out > >> as it gave me some interesting results! > >> > >> It would be so great if someone on this list had an idea what the test > >> results mean and whether these results are pointing to a certain problem... > >> > >> Here the interesting part (TEST 10 is the one always failing) of the > >> testusb output: > >> > >> # ./testusb -l -D /dev/bus/usb/001/003 -t 10 > >> unknown speed /dev/bus/usb/001/003 0 > >> /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > >> /dev/bus/usb/001/003 test 10, 8.011131 secs > >> /dev/bus/usb/001/003 test 10, 8.010904 secs > >> /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > >> /dev/bus/usb/001/003 test 10 --> 121 (Remote I/O error) > >> /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > >> /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > >> /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > >> /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > >> /dev/bus/usb/001/003 test 10, 8.012122 secs > >> /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > >> ... > > Test 10 tests how well control messages are handled when they are > > queued up. It has a large number of subcases that get tried > > iteratively, and the whole thing is repeated, in this case 1000 times. > > Do you possibly have any idea why its only TEST 10 which is failing? No. > Compared to the other tests, does TEST 10 (queued control messages) > require some special treatment by the HW or USB-bus or kernel or something? Just the ability to manage a queue of control requests. > Maybe you could even images what could be the reason for this? Like > USB-bus clock or speed? No idea. > >> And here the corresponding info from the usbtest kernel module (dmesg > >> output): > >> > >> [15298.990000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times > >> [15300.330000] usbtest 1-2:3.0: subtest 4 error, status -32 > >> [15300.330000] usbtest 1-2:3.0: control queue 80.00, err -32, 26667 > >> left, subcase 4, len 0/2 > > Subtest 4 is a Get-Device-Status control request. It is not expected > > to fail. The log indicates that the device sent a STALL instead of a > > proper reply -- that's what the -32 means. However, the subtest did > > complete correctly over a hundred times before the failure occurred. > > > >> [15300.350000] usbtest 1-2:3.0: subcase 5 completed out of order, last 3 > > This means subtest 3 was the one that completed previously. This > > should not happen -- requests are always supposed to complete in order, > > so subtest 4 should have completed before 5. > > > >> [15300.350000] usbtest 1-2:3.0: subcase 4 completed out of order, last 4 > > That's even weirder. I don't see how that could happen. But the > > error pathways in this driver have not been tested thoroughly. > > I also cannot understand why this can happen... Do you know at which > level this kind of Queue for these messages is implemented, is it kernel > driver or maybe device firmware? In the kernel driver. > >> [15300.370000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times > >> [15308.380000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times > >> [15316.390000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times > >> [15322.230000] usbtest 1-2:3.0: subtest 4 error, status -32 > >> [15322.230000] usbtest 1-2:3.0: control queue 80.00, err -32, 8715 left, > >> subcase 4, len 0/2 > >> [15322.240000] usbtest 1-2:3.0: subcase 5 completed out of order, last 3 > >> [15322.250000] usbtest 1-2:3.0: subcase 4 completed out of order, last 4 > >> [15322.260000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times > >> [15324.870000] usbtest 1-2:3.0: subtest 9 error, status -121 > >> [15324.870000] usbtest 1-2:3.0: control queue 82.00, err -121, 21574 > >> left, subcase 9, len 0/2 > > Error code -121 means a short reply was received -- in this case 0 > > bytes instead of 2. > > > > The rest of the errors are all similar to these. > > > >> Looking forward to hear from you! Best regards, Marek > > Bear in mind that the problem could lie in the test computer rather > > than in the device! > > > > The traces above were produced when 2 (problem) devices were connected, > one of them running testusb on UST-Host, connected to the other device > with g_zero module loaded. > When I run testusb on my notebook, still connected to one of the > devices, then I get the same results from TEST 10 except that > > /dev/bus/usb/001/003 test 10 --> 121 (Remote I/O error) > > never appears. There are only > > /dev/bus/usb/001/003 test 10 --> 32 (Broken pipe) > > messages in this case... can one possibly conclude something from this > fact...? Well, it suggests that the 121 errors were related to a problem in the host controller driver or hardware. And that the 32 errors are related to a problem in the device controller driver or hardware. > The devices I am speaking about have an Atmel at91 sam9260 controller. > Anyways, if you have any idea of any possibilities what could be wrong > to produce such test results... that would be great help! Not knowing anything about that hardware, my insights are extremely limited (to put it mildly). :-) Have you tried using a USB bus analyzer to record the signals on the wire? > Many thanks!!! Marek Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html