Re: [RFT 4/5] xhci: Restore event ring dequeue pointer on resume.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 23, 2012 at 02:11:10PM +0800, Elric Fu wrote:
> 2012/3/23 Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>:
> > On Wed, Mar 21, 2012 at 11:01:12AM +0800, Elric Fu wrote:
> >> 2012/3/21 Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>:
> >> > On Tue, Mar 20, 2012 at 12:44:03PM +0800, Elric Fu wrote:
> >> >> 2012/3/17 Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>:
> >> > If the USB 2.0 hub really isn't responding to the Set Address command,
> >> > there's not much we can do about it.  I think there is one xHCI driver
> >> > bug where it won't issue a command abort when the Set Address command
> >> > times out, but you wouldn't see any other xHCI commands work after that
> >> > happens.  You would also see messages like:
> >> >
> >> >                xhci_warn(xhci, "%s while waiting for address device command\n",
> >> >                                timeleft == 0 ? "Timeout" : "Signal");
> >> >
> >> > There's some trickiness to the solution to the set address timeout
> >> > handling (you need a flag to make it so the driver doesn't ring the
> >> > command ring doorbell or issue a new command abort until you've received
> >> > confirmation of the first canceled command), so I haven't had time to
> >> > get around to it yet.  However, if you're not seeing the Timeout
> >> > messages, the device probably is responding to the control transfer.
> >> >
> >>
> >> You're right. I did't find any message about timeout.
> >>
> >> Actually, I have found the xHCI driver has the bug when the address device
> >> command timeout before. According to the xHCI spec, when the address
> >> device command timeout is happened, the driver have to abort the command
> >> ring and wait the event about the command ring stopped. Otherwise, the
> >> command ring will wait the response from the device all along. I am waiting for
> >> a special device to duplicate the issue and confirm the solution is correct.
> >
> > Yes, you're correct that the xHCI driver is supposed to issue a command
> > abort when the set address command times out.  Does your solution
> > include making sure that another command submission doesn't ring the
> > doorbell?
> 
> No, is it necessary? My solution was used to handle the address device
> command timeout. You know when the address device command timeout
> occur, the command ring will hang up and wait for the response from
> device unless driver abort the command ring. Driver can keep on submitting
> command TRB, and the host will handle the command TRB again until
> ringing the doorbell after the command ring is stopped.
> 
> Even if another command submission ring the doorbell when command ring
> is aborted, the command ring will continue handling those command TRB
> after it is restarted.

Yes, I think some sort of flag or command abort list is necessary.
First, there's no guarantee that the command won't complete while we're
in the middle of issuing the command abort.  So there needs to be a way
to coordinate that case in the driver.  Second, what if multiple
commands have timed out because the current command was stuck?  We need
to be able to handle that case as well, and we probably need some flags
there.

Further, we really don't want the hardware to touch the command
structures after we've attempted to abort the command (I'm mostly
thinking of commands other than the Set Address command).  So I think we
really want to stop the command with the command abort flag, and write a
no-op command to the command TRB that we're trying to cancel, just like
we do for canceled TRBs in the endpoint ring.

Basically, it's much more complicated than just issuing a command abort
when the set address times out.  I'd really rather think about the
implications and get the solution right the first time than have a quick
hack in the driver to work around a buggy device.

> > Feel free to send your patch before you get your special device.  I have
> > an old VIA USB 3.0 hub I can test the set address timeout on.  It stops
> > responding to set addresses after several disconnect/connect cycles.
> >
> 
> Hmm. As I mentioned above, those patched was just used to handle address
> device command timeout and tested by our customer. If you wanna check it
> out, I need to verify them carefully first to prevent them generating other
> problems. Maybe I need to perfect it for handling all cases.

IMO, it's better to send a partially completed patch and get the
conversation rolling than to get it "perfect" and then have to redo it
when some points out some basic flaw.  That's the open source way of
doing patches.  We get something mostly working, send it to the
community for feedback, integrate the feedback, and resubmit the patch
series.

> >> > Maybe you need to ask the firmware team if there's anything special that
> >> > needs to be done to initialize the integrated USB 2.0 hub after a resume
> >> > from suspend?  Maybe it needs a Get Descriptor control transfer before
> >> > the Set Address control transfer?  Or it could be something completely
> >> > different.  At this point, I'm not really sure how to fix the issue with
> >> > the USB 2.0 hub.
> >>
> >> Maybe I need to determine the port link state of the port attached by the
> >> integrated usb 2.0 hub. I worry that the port link state is transition to other
> >> states except for U0 after the integrated 2.0 hub resume from suspend.
> >
> > But shouldn't the xHCI host controller report that port state?  Or do
> > you think it's a host bug as well?
> 
> Yes. I think it is host issue. Actually, I have found some strange response
> from integrated hub. I need to do more tries.

Ok, let me know what you find.  I'm inclined to hold off on the reset
resume quirk until we know what's actually going on.  I can't send
patches to Greg until next week anyway.

> >> I agree with your opinions. So we have to add a XHCI_RESET_ON_RESUME
> >> quirk for VIA host before the suspend operation is finished.
> >
> > I'm a bit confused.  Are you saying that the USB 2.0 hub comes back fine
> > (answers to a set address) if we ignore the register restore step for S3
> > resume and just halt and reset the host controller?  Meaning that the
> > VIA host needs your original XHCI_RESET_ON_RESUME quirk patch on top of
> > patch 2?
> 
> Yes. If driver re-initialize the host in suspend operation, the usb
> 2.0 hub comes
> back fine and all devices attached to it works fine.
> 
> >
> > If so, can you refine your patch to only set the reset resume for the
> > specific revisions of VIA hosts that have this issue?  I'd like to give
> > VIA the chance to fix this issue in their future chip revisions.
> 
> OK, I will modify it and send it again. In additional, do you have any
> comment about the patch that print the VIA host firmware version?
> if you think maybe it is useful sometimes, perhaps I can add the
> XHCI_RESET_ON_RESUME quirk in it.

I think it's useful, but I would rather the firmware printing be a
completely separate patch on top of the quirk.  Then we can send the
reset resume quirk patch off to the stable tree, and have the firmware
printing patch go into the 3.5 queue.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux