Re: xHCI host dies on device unplug

On Mon, Dec 19, 2022 at 02:25:46PM +0200, Mathias Nyman wrote:
> On 16.12.2022 23.32, Ladislav Michl wrote:
> > On Fri, Dec 16, 2022 at 12:13:23PM +0200, Mathias Nyman wrote:
> > > On 15.12.2022 18.12, Ladislav Michl wrote:
> > > > +Cc Mathias as he last touched this code path and may know more :)
> > > > 
> > > > On Tue, Dec 06, 2022 at 02:17:08PM +0100, Ladislav Michl wrote:
> > > > > On Mon, Dec 05, 2022 at 10:27:57PM +0100, Ladislav Michl wrote:
> > > > > > I'm running current linux.git on custom Marvell OCTEON III CN7020
> > > > > > based board. USB devices like FTDI (idVendor=0403, idProduct=6001,
> > > > > > bcdDevice= 6.00) Realtek WiFi dongle (idVendor=0bda, idProduct=8179,
> > > > > > bcdDevice= 0.00) works without issues, while Ralink WiFi dongle
> > > > > > (idVendor=148f, idProduct=5370, bcdDevice= 1.01) kills the host on
> > > > > > disconnect:
> > > > > > xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command
> > > > > > xhci-hcd xhci-hcd.0.auto: xHCI host controller not responding, assume dead
> > > > > > xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
> > > > > > 
> > > > > > Unfortunately I do not have a datasheet for the CN7020 SoC, so it is hard
> > > > > > to tell if there are any errata :/ In case anyone sees a clue in the debug
> > > > > > logs below, I'll happily give it a try.
> > > > > 
> > > > > So I do have datasheet now. As a wild guess I tried to use dlmc_ref_clk0
> > > > > instead of dlmc_ref_clk1 as a refclk-type-ss and it fixed unplug death.
> > > > > I have no clue why, but anyway - sorry for the noise :) Perhaps Octeon's
> > > > > clock init is worth to be verified...
> > > > 
> > > > After all, whether xhci dies with "xHCI host not responding to stop endpoint
> > > > command" also depends on temperature, so there seems to be a race somewhere.
> > > > 
> > > > As a quick and dirty verification, whenever xhci really died, the following patch
> > > > was tested and it fixed the issue. It just treats the ep as if the stop endpoint
> > > > command succeeded. Any clues? I'll happily provide more traces.
> > > 
> > > It's possible the controller did complete the stop endpoint command but driver
> > > didn't get the interrupt for the event for some reason.
> > > 
> 
> Looks like controller didn't complete the stop endpoint command.
> 
> Event for last completed command (before cycle bit change "c" -> "C") was:
>   0x00000000028f55a0: TRB 00000000035e81a0 status 'Success' len 0 slot 1 ep 0 type 'Command Completion Event' flags e:c,
> 
> This was for command at 35e81a0, which in the command ring was:
>   0x00000000035e81a0: Reset Endpoint Command: ctx 0000000000000000 slot 1 ep 3 flags T:c
> 
> The stop endpoint command was the next command queued, at 35e81b0:
>   0x00000000035e81b0: Stop Ring Command: slot 1 sp 0 ep 3 flags c
> 
> There were a lot of URBs queued for this device, and they are cancelled one by one after disconnect.
> 
> Was this the only device connected? If so does connecting another usb device to another root port help?
> Just to test if the host for some reason partially stops a while after last device disconnect?

The device is connected directly to the SoC. When it is connected through a hub,
the host doesn't die (as noted in another email; sorry for not replying to my own
message, so it got lost). It looks like an intentional (power management?)
optimization: if another device is plugged in before the 5 second timeout
expires, the host completes the stop endpoint command.

Unfortunately I cannot find anything describing this behavior in the
documentation, so I'll ask the manufacturer's support.

Both workarounds, doing nothing or resetting the controller once the last device
is unplugged, work, but I doubt they are suitable for the mainline kernel without
further investigation.

> Another thing is that the stop endpoint command fails after three soft reset tries,
> does disabling soft reset help?

No, this does not cause any change.
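
For reference, the lookup in the quoted trace analysis (last completed command
at 35e81a0, next queued command at 35e81b0) follows from every xHCI TRB being
16 bytes. A minimal sketch in Python, assuming the trace lines have the shape
shown in this thread; the function name and the regular expression are
illustrative only and not part of any kernel tool, and Link TRBs and ring
wrap-around are ignored:

```python
import re

TRB_SIZE = 16  # every xHCI TRB is four 32-bit words

# Matches the "TRB <address> status '<code>'" part of a Command Completion
# Event line as it appears in the traces quoted in this thread (assumed format).
EVENT_RE = re.compile(r"TRB ([0-9a-fA-F]+) status '([^']*)'")


def next_command_addr(event_line):
    """Return the address of the command TRB queued right after the one
    this completion event refers to (hypothetical helper; no Link TRB
    or ring wrap handling)."""
    m = EVENT_RE.search(event_line)
    if m is None:
        raise ValueError("not a command completion event line")
    return int(m.group(1), 16) + TRB_SIZE


event = ("0x00000000028f55a0: TRB 00000000035e81a0 status 'Success' "
         "len 0 slot 1 ep 0 type 'Command Completion Event' flags e:c,")
print(hex(next_command_addr(event)))  # 0x35e81b0
```

The printed address matches the Stop Ring Command at 35e81b0, i.e. the
command that never got a completion event.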

	ladis