Hi Alan, Greg, everyone,
On 04. 08. 23 21:09, Alan Stern wrote:
> An outstanding syzbot bug report has been traced to a race between the
> routine that reads in the device descriptor for a device being
> reinitialized and the routine that writes the descriptors to a sysfs
> attribute file. The problem is that reinitializing a device, like
> initializing it for the first time, stores the device descriptor
> directly in the usb_device structure, where it may be accessed
> concurrently as part of sending the descriptors to the sysfs reader.
I suspect that some of these patches (three from the original
series, plus the "Fix oversight..." one) introduced a regression we see
with some USB devices in Home Assistant OS (and on a mainstream distro
as well, see below). In particular, it's the Z-Wave.me UZB stick
(0658:0200). However, roughly at the time these patches were introduced,
we started to see a few more reports of issues with USB devices
(generally radios for IoT protocols), so I can't rule out that they are
the source of more regressions.
For this particular device we have the most detailed tracing of the
issue, confirming that it also manifests on a mainstream distribution
(Debian) which included these patches in its kernel. Most issue reports
come from the RPi 3, but we also got them on amd64, on both HAOS and
Debian.
I'm a layman in terms of the USB stack, so I might be wrong about some
assumptions, but the device seems to have always misbehaved due to a
poor HW (?) implementation: every time it's plugged into a USB port,
the following messages appear:
[ 1134.073005] usb 1-1.4: new full-speed USB device number 12 using dwc_otg
[ 1134.153006] usb 1-1.4: device descriptor read/64, error -32
[ 1134.341003] usb 1-1.4: device descriptor read/64, error -32
[ 1134.529004] usb 1-1.4: new full-speed USB device number 13 using dwc_otg
[ 1134.609063] usb 1-1.4: device descriptor read/64, error -32
[ 1134.797005] usb 1-1.4: device descriptor read/64, error -32
[ 1134.905181] usb 1-1-port4: attempt power cycle
However, kernel versions prior to 6.1.52, or 6.1.73 with these patches
reverted, were able to recover:
[ 1135.717049] usb 1-1.4: new full-speed USB device number 14 using dwc_otg
[ 1135.741234] usb 1-1.4: New USB device found, idVendor=0658, idProduct=0200, bcdDevice= 0.00
[ 1135.741275] usb 1-1.4: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 1135.743959] cdc_acm 1-1.4:1.0: ttyACM0: USB ACM device
Without these patches reverted, 6.1.73 goes through two more rounds of
device descriptor read errors, and ends with:
[ 263.705865] usb 1-1-port4: unable to enumerate USB device
It should also be noted that this seems to happen only on USB 2 ports.
On USB 3/SS ports, the descriptor read errors are "protocol error"
(-71) instead of "broken pipe" (-32), and the driver recovers (having
realized this, I am now finally able to reproduce the issue in my
environment):
[ 38.244292] usb 2-3: new full-speed USB device number 3 using xhci_hcd
[ 38.372319] usb 2-3: device descriptor read/64, error -71
[ 38.608317] usb 2-3: device descriptor read/64, error -71
[ 38.844287] usb 2-3: new full-speed USB device number 4 using xhci_hcd
[ 38.972317] usb 2-3: device descriptor read/64, error -71
[ 39.208325] usb 2-3: device descriptor read/64, error -71
[ 39.316405] usb usb2-port3: attempt power cycle
[ 39.936295] usb 2-3: new full-speed USB device number 5 using xhci_hcd
[ 39.957228] usb 2-3: New USB device found, idVendor=0658, idProduct=0200, bcdDevice= 0.00
[ 39.957241] usb 2-3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 39.999591] cdc_acm 2-3:1.0: ttyACM0: USB ACM device
[ 39.999639] usbcore: registered new interface driver cdc_acm
[ 39.999641] cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters
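In case it helps anyone compare their own logs against the ones above, here is a minimal sketch (not something used in the original reports) that counts the descriptor read errors by errno and reports whether enumeration eventually succeeded. The function name `summarize` and the outcome labels are made up for illustration; the patterns only match the message formats quoted in this mail, and the errno names are the standard Linux values (32 = EPIPE, 71 = EPROTO).

```python
import re
from collections import Counter

# Standard Linux errno values for the two errors seen in the logs above.
ERRNO_NAMES = {32: "EPIPE (broken pipe)", 71: "EPROTO (protocol error)"}

# Patterns for the three message types quoted in this mail.
READ_ERR = re.compile(r"device descriptor read/64, error -(\d+)")
FOUND = re.compile(r"New USB device found")
FAILED = re.compile(r"unable to enumerate USB device")


def summarize(log_text):
    """Return (Counter of errno -> count, final outcome string).

    The outcome is "enumerated", "failed", or "unknown", taken from the
    last matching line in the log (hypothetical labels for this sketch).
    """
    errors = Counter()
    outcome = "unknown"
    for line in log_text.splitlines():
        match = READ_ERR.search(line)
        if match:
            errors[int(match.group(1))] += 1
        elif FOUND.search(line):
            outcome = "enumerated"
        elif FAILED.search(line):
            outcome = "failed"
    return errors, outcome


def describe(errors, outcome):
    """Format the summary as a short human-readable string."""
    parts = [
        f"{count}x {ERRNO_NAMES.get(code, 'errno ' + str(code))}"
        for code, count in sorted(errors.items())
    ]
    return f"{outcome}: " + (", ".join(parts) if parts else "no read errors")
```

Feeding it the output of `dmesg` captured right after plugging the stick in (for example, `describe(*summarize(open("dmesg.txt").read()))`, where `dmesg.txt` is a placeholder file name) should make it easy to tell the recovering case from the failing one.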
This is the gist of the problem; more detailed findings can be found in
the reports by @FredrikFornstad in the GH issue [1], who managed to
reproduce and pinpoint the likely source of the problem.
Let me know if you need any more details, or if there's something more
to try. I'll be happy to help with getting this resolved.
Thanks,
Jan
[1] https://github.com/home-assistant/operating-system/issues/2995#issuecomment-1973507518