Re: USB device disconnects on resume

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am Mittwoch, dem 20.04.2022 um 10:40 -0400 schrieb Alan Stern:
> On Wed, Apr 20, 2022 at 12:37:36PM +0200, Martin Kepplinger wrote:
> > Am Dienstag, dem 19.04.2022 um 10:32 -0400 schrieb Alan Stern:
> > > On Tue, Apr 19, 2022 at 11:28:21AM +0200, Martin Kepplinger
> > > wrote:
> > > > hi,
> > > > 
> > > > I'm seeing resets and re-enumerations on runtime-resume for one
> > > > device
> > > > a lot. It's a modem connected to the USB2642 Microchip (SMSC)
> > > > USB2
> > > > hub,
> > > > that's connected to an xhci HC.
> > > > 
> > > > A remote wakeup *sometimes* makes the hub say "physically
> > > > disconnected"
> > > > during resume in hub_activate(), and thus sets reset_resume.
> > > > Then
> > > > the
> > > > device comes up as low-speed device once, which again is not
> > > > allowed
> > > > during normal runtime resume, so would itself trigger a reset.
> > > 
> > > Does the reset-resume always fail in this way?
> > 
> > Resetting itself doesn't usually fail in the sense that a device
> > would
> > not work anymore after resetting. The problem is that the resets
> > happen
> > in the first place. 90+% of runtime-resumes are fine - auto- and
> > wakeup-resume. Resetting is a major problem though, imagine a modem
> > device being re-enumerated during a phone call or "realtime" data
> > connection. I see that a lot.
> 
> Okay, I see.
> 
> > Let me record what hub.c says when leading up to the reset of 1-1.2
> > (the modem), with logs of a normal runtime resume/suspend cycle
> > included before that, as reference:
> > 
> > 1650447001.174798 pureos kernel: usb 1-1: usb auto-resume
> > 1650447001.242810 pureos kernel: usb 1-1: Waited 0ms for CONNECT
> > 1650447001.247853 pureos kernel: usb 1-1: finish resume
> > 1650447001.249697 pureos kernel: hub 1-1:1.0: hub_resume
> > 1650447001.251409 pureos kernel: usb 1-1-port1: status 0507 change
> > 0000
> > 1650447001.251624 pureos kernel: usb 1-1-port2: status 0507 change
> > 0000
> > 1650447001.251793 pureos kernel: hub 1-1:1.0: state 7 ports 3 chg
> > 0000
> > evt 0000
> > 1650447001.253052 pureos kernel: usb 1-1.2: usb auto-resume
> 
> What is the cause of this runtime resume?  According to the port
> status 
> above, the 1-1.2 device did not send a wakeup request.

How would I find out? Recording via usbmon is next on my todo list.

> 
> > 1650447001.318845 pureos kernel: usb 1-1.2: Waited 0ms for CONNECT
> > 1650447001.324925 pureos kernel: usb 1-1.2: finish resume
> > 1650447003.831095 pureos kernel: usb 1-1.2: usb auto-suspend,
> > wakeup 1
> > 1650447003.854701 pureos kernel: hub 1-1:1.0: hub_suspend
> > 1650447003.874773 pureos kernel: usb 1-1: usb auto-suspend, wakeup
> > 1
> > 1650447003.922054 pureos kernel: usb 1-1: usb wakeup-resume
> 
> This wakeup occurred only 48 ms after the hub was runtime suspended. 
> But here at least the cause is evident: The hub sent a wakeup request
> because its child (the 1-1.2 modem) disconnected.

fwiw, that wakeup-resume *always* comes about 50 ms after the last
runtime suspend.

> 
> > 1650447003.942066 pureos kernel: usb 1-1: Waited 0ms for CONNECT
> > 1650447003.945755 pureos kernel: usb 1-1: finish resume
> > 1650447003.947589 pureos kernel: hub 1-1:1.0: hub_resume
> > 1650447003.949226 pureos kernel: usb 1-1-port1: status 0507 change
> > 0000
> > 1650447003.949430 pureos kernel: usb 1-1-port2: status 0101 change
> > 0005
> > 1650447004.058779 pureos kernel: hub 1-1:1.0: state 7 ports 3 chg
> > 0004
> > evt 0000
> > 1650447004.074089 pureos kernel: usb 1-1.2: usb wakeup-resume
> > 1650447004.094056 pureos kernel: usb 1-1.2: Waited 0ms for CONNECT
> > 1650447004.097255 pureos kernel: usb 1-1.2: finish reset-resume
> > 1650447004.182333 pureos kernel: usb 1-1.2: reset high-speed USB
> > device
> > number 5 using xhci-hcd
> > 1650447004.314425 pureos kernel: usb 1-1-port2: resume, status 0
> > 1650447004.317628 pureos kernel: usb 1-1-port2: status 0101, change
> > 0004, 12 Mb/s
> > 1650447004.318673 pureos kernel: usb 1-1.2: USB disconnect, device
> > number 5
> > 1650447004.323374 pureos kernel: usb 1-1.2: unregistering device
> 
> And it looks like in this case, the reset-resume failed.

Well, at least reset_resume has been set, which I want to avoid.

> 
> > So before resetting, the hub reads
> > "usb 1-1-port2: status 0101 change 0005" instead of normally
> > "usb 1-1-port2: status 0507 change 0000"
> > 
> > but I don't know why. That portstatus/portchange doesn't change
> > over
> > time when I just keep reading portstatus/portchange in
> > hub_activate()
> > in a loop.
> 
> You mean that if the port status and change values are originally 
> 0101 and 0005 in hub_activate(), they remain equal to those values? 
> And 
> similarly if they are originally 0507 and 0000?
> 
> That is to be expected.  Nothing happens to make those values change 
> until hub_activate() sends some commands to the hub.

I see.

Of course this doesn't make much sense, but just so you know: if I just
don't let hub_activate() set udev->reset_resume to 1, then
check_port_resume_type() will do so, and thus again
finish_port_resume() will reset the device by calling
usb_reset_and_verify_device().

> 
> > > > The Hub and device is permanently connected on the PCB, so the
> > > > hub
> > > > is
> > > > interpreting it in a wrong way.
> > > 
> > > What is the hub is interpreting in a wrong way?  Why should a
> > > permanent 
> > > connection on the PCB have anything to do with whether the resume
> > > signals are misinterpreted?
> > 
> > I only wanted to say that the device (modem in this case) cannot be
> > unplugged - there's no plug. That's all :)
> >      
> > https://elixir.bootlin.com/linux/latest/source/drivers/usb/core/hub.c#L1197
> > interprets my situation as a "removed" device.
> 
> What it means is that the modem was electronically disconnected from
> the 
> USB bus.  In theory this could be the result of a mixup in the resume
> signals, but it's more likely that the modem did this deliberately 
> because its firmware crashed.  (Why it should crash while it is 
> suspended is a good question, though...)

ok. assuming such a firmware bug, if I set a new quirk for the device,
do you think I can work around that (in hub.c?) in a way that userspace
doesn't really notice?

In theory, if I know this behaviour in advance, I think I should be
able to somehow wait until the device is ready again instead of
resetting.


> 
> > > >  I found an email that describes what I
> > > > see from Sarah Sharp in 2013 here:
> > > > https://marc.info/?l=linux-usb&m=137754385421825&w=2 Where she
> > > > says:
> > > > 
> > > > "Occasionally, the host controller was sending the SoFs too
> > > > soon on
> > > > resume, and the device would interpret it as a low-speed
> > > > chirp. 
> > > > The
> > > > device would disconnect, and transform from a high speed device
> > > > to
> > > > a
> > > > low speed device.  I don't think increasing the 10 ms time out
> > > > will
> > > > help at all in this case, but you did ask what USB device
> > > > disconnect
> > > > scenarios I've seen."
> > > 
> > > Read the following messages in that email thread.  Sarah said
> > > that
> > > she 
> > > would fix the SoF signal timing in xhci-hcd ("I agree that this
> > > seems
> > > like an xHCI driver issue, and I'll fix it in the driver").  I
> > > have
> > > no 
> > > idea whether this helped the faulty devices; my guess is that it
> > > didn't.
> > 
> > Do you know with what changes she tried to fix that?
> 
> No.  But you ought to be able to see by checking the history for that
> time period.
> 
> Alan Stern





[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux