Hi Alan,
First of all, many thanks for the reply! Please read my comments below...
On 22.05.2017 21:46, Alan Stern wrote:
On Mon, 22 May 2017, Marek Pikarski wrote:
Hi!
I am currently hunting the source of an issue that we have with an USB
modem. There could even be some HW / board design issue.
My idea was to try the kernel USB unit tests and this seems to work out
as it gave me some interesting results!
It would be so great if someone on this list had an idea what the test
results mean and whether these results are pointing to a certain problem...
Here the interesting part (TEST 10 is the one always failing) of the
testusb output:
# ./testusb -l -D /dev/bus/usb/001/003 -t 10
unknown speed /dev/bus/usb/001/003 0
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
/dev/bus/usb/001/003 test 10, 8.011131 secs
/dev/bus/usb/001/003 test 10, 8.010904 secs
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
/dev/bus/usb/001/003 test 10 --> 121 (Remote I/O error)
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
/dev/bus/usb/001/003 test 10, 8.012122 secs
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
...
Test 10 tests how well control messages are handled when they are
queued up. It has a large number of subcases that get tried
iteratively, and the whole thing is repeated, in this case 1000 times.
Do you possibly have any idea why its only TEST 10 which is failing?
Compared to the other tests, does TEST 10 (queued control messages)
require some special treatment by the HW or USB-bus or kernel or something?
Maybe you could even images what could be the reason for this? Like
USB-bus clock or speed?
And here the corresponding info from the usbtest kernel module (dmesg
output):
[15298.990000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times
[15300.330000] usbtest 1-2:3.0: subtest 4 error, status -32
[15300.330000] usbtest 1-2:3.0: control queue 80.00, err -32, 26667
left, subcase 4, len 0/2
Subtest 4 is a Get-Device-Status control request. It is not expected
to fail. The log indicates that the device sent a STALL instead of a
proper reply -- that's what the -32 means. However, the subtest did
complete correctly over a hundred times before the failure occurred.
[15300.350000] usbtest 1-2:3.0: subcase 5 completed out of order, last 3
This means subtest 3 was the one that completed previously. This
should not happen -- requests are always supposed to complete in order,
so subtest 4 should have completed before 5.
[15300.350000] usbtest 1-2:3.0: subcase 4 completed out of order, last 4
That's even weirder. I don't see how that could happen. But the
error pathways in this driver have not been tested thoroughly.
I also cannot understand why this can happen... Do you know at which
level this kind of Queue for these messages is implemented, is it kernel
driver or maybe device firmware?
[15300.370000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times
[15308.380000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times
[15316.390000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times
[15322.230000] usbtest 1-2:3.0: subtest 4 error, status -32
[15322.230000] usbtest 1-2:3.0: control queue 80.00, err -32, 8715 left,
subcase 4, len 0/2
[15322.240000] usbtest 1-2:3.0: subcase 5 completed out of order, last 3
[15322.250000] usbtest 1-2:3.0: subcase 4 completed out of order, last 4
[15322.260000] usbtest 1-2:3.0: TEST 10: queue 32 control calls, 1000 times
[15324.870000] usbtest 1-2:3.0: subtest 9 error, status -121
[15324.870000] usbtest 1-2:3.0: control queue 82.00, err -121, 21574
left, subcase 9, len 0/2
Error code -121 means a short reply was received -- in this case 0
bytes instead of 2.
The rest of the errors are all similar to these.
Looking forward to hear from you! Best regards, Marek
Bear in mind that the problem could lie in the test computer rather
than in the device!
The traces above were produced when 2 (problem) devices were connected,
one of them running testusb on UST-Host, connected to the other device
with g_zero module loaded.
When I run testusb on my notebook, still connected to one of the
devices, then I get the same results from TEST 10 except that
/dev/bus/usb/001/003 test 10 --> 121 (Remote I/O error)
never appears. There are only
/dev/bus/usb/001/003 test 10 --> 32 (Broken pipe)
messages in this case... can one possibly conclude something from this
fact...?
The devices I am speaking about have an Atmel at91 sam9260 controller.
Anyways, if you have any idea of any possibilities what could be wrong
to produce such test results... that would be great help!
Many thanks!!! Marek
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html