On 31.7.2019 12.18, Enric Balletbo Serra wrote:
Hi Mathias,
Thanks to look into this.
Missatge de Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> del dia dt.,
30 de jul. 2019 a les 21:39:
On 27.7.2019 23.43, Bob Gleitsmann wrote:
OK, here's the result of the bisection:
ef513be0a9057cc6baf5d29566aaaefa214ba344 is the first bad commit
commit ef513be0a9057cc6baf5d29566aaaefa214ba344
Author: Jim Lin <jilin@xxxxxxxxxx>
Date:???? Mon Jun 3 18:53:44 2019 +0800
?????? usb: xhci: Add Clear_TT_Buffer
??????
?????? USB 2.0 specification chapter 11.17.5 says "as part of endpoint halt
?????? processing for full-/low-speed endpoints connected via a TT, the host
?????? software must use the Clear_TT_Buffer request to the TT to ensure
?????? that the buffer is not in the busy state".
??????
?????? In our case, a full-speed speaker (ConferenceCam) is behind a high-
?????? speed hub (ConferenceCam Connect), sometimes once we get STALL on a
?????? request we may continue to get STALL with the folllowing requests,
?????? like Set_Interface.
??????
?????? Here we invoke usb_hub_clear_tt_buffer() to send Clear_TT_Buffer
?????? request to the hub of the device for the following Set_Interface
?????? requests to the device to get ACK successfully.
??????
?????? Signed-off-by: Jim Lin <jilin@xxxxxxxxxx>
?????? Acked-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
?????? Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
??drivers/usb/host/xhci-ring.c | 27 ++++++++++++++++++++++++++-
??drivers/usb/host/xhci.c?????????? | 21 +++++++++++++++++++++
??drivers/usb/host/xhci.h?????????? |?? 5 +++++
??3 files changed, 52 insertions(+), 1 deletion(-)
Thanks, a quick look doesn't immediately open up the cause to me.
Most likely an endpoint or struct usb_device got dropped and freed at suspend/resume,
but we probably have some old stale pointer still in a a TD or URB to it.
could you apply the hack below, it should show more details about this issue.
grep for "Mathias" after resume, if you find it we just prevented a crash.
With the below patch the oops disappears and the reason is
root@debian:~# dmesg | grep "Mathias"
[ 67.747933] xhci-hcd xhci-hcd.8.auto: Mathias: No vdev for slot id 0
Ok, thanks,
When we free the xhci virt_dev the udev->slot_is set to zero as well.
Looks like whole xHCI was reset are resume:
[ 2994.539409] usb usb8: root hub lost power or was reset
[ 2994.539411] usb usb9: root hub lost power or was reset
This means that xHC controller was reset and xhci driver re-allocated everything.
It makes sense to check that xhci virt_device exists in the endpoint reset callback.
This will fix the oops, but I'm still missing the big picture, how we ended up here.
Would it be possible for you to take traces and logs with the previous patch that prevents
the oops, but shows the "Mathias: No vdev for slot id 0" message?
-Mathias