Re: [RFT 4/5] xhci: Restore event ring dequeue pointer on resume.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



2012/3/24 Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>:
> On Fri, Mar 23, 2012 at 02:11:10PM +0800, Elric Fu wrote:
>> 2012/3/23 Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>:
>> > On Wed, Mar 21, 2012 at 11:01:12AM +0800, Elric Fu wrote:
>> >> 2012/3/21 Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>:
>> >> > On Tue, Mar 20, 2012 at 12:44:03PM +0800, Elric Fu wrote:
>> >> >> 2012/3/17 Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>:
>> >> > If the USB 2.0 hub really isn't responding to the Set Address command,
>> >> > there's not much we can do about it.  I think there is one xHCI driver
>> >> > bug where it won't issue a command abort when the Set Address command
>> >> > times out, but you wouldn't see any other xHCI commands work after that
>> >> > happens.  You would also see messages like:
>> >> >
>> >> >                xhci_warn(xhci, "%s while waiting for address device command\n",
>> >> >                                timeleft == 0 ? "Timeout" : "Signal");
>> >> >
>> >> > There's some trickiness to the solution to the set address timeout
>> >> > handling (you need a flag to make it so the driver doesn't ring the
>> >> > command ring doorbell or issue a new command abort until you've received
>> >> > confirmation of the first canceled command), so I haven't had time to
>> >> > get around to it yet.  However, if you're not seeing the Timeout
>> >> > messages, the device probably is responding to the control transfer.
>> >> >
>> >>
>> >> You're right. I did't find any message about timeout.
>> >>
>> >> Actually, I have found the xHCI driver has the bug when the address device
>> >> command timeout before. According to the xHCI spec, when the address
>> >> device command timeout is happened, the driver have to abort the command
>> >> ring and wait the event about the command ring stopped. Otherwise, the
>> >> command ring will wait the response from the device all along. I am waiting for
>> >> a special device to duplicate the issue and confirm the solution is correct.
>> >
>> > Yes, you're correct that the xHCI driver is supposed to issue a command
>> > abort when the set address command times out.  Does your solution
>> > include making sure that another command submission doesn't ring the
>> > doorbell?
>>
>> No, is it necessary? My solution was used to handle the address device
>> command timeout. You know when the address device command timeout
>> occur, the command ring will hang up and wait for the response from
>> device unless driver abort the command ring. Driver can keep on submitting
>> command TRB, and the host will handle the command TRB again until
>> ringing the doorbell after the command ring is stopped.
>>
>> Even if another command submission ring the doorbell when command ring
>> is aborted, the command ring will continue handling those command TRB
>> after it is restarted.
>
> Yes, I think some sort of flag or command abort list is necessary.
> First, there's no guarantee that the command won't complete while we're
> in the middle of issuing the command abort.  So there needs to be a way
> to coordinate that case in the driver.  Second, what if multiple
> commands have timed out because the current command was stuck?  We need
> to be able to handle that case as well, and we probably need some flags
> there.
>
> Further, we really don't want the hardware to touch the command
> structures after we've attempted to abort the command (I'm mostly
> thinking of commands other than the Set Address command).  So I think we
> really want to stop the command with the command abort flag, and write a
> no-op command to the command TRB that we're trying to cancel, just like
> we do for canceled TRBs in the endpoint ring.
>
> Basically, it's much more complicated than just issuing a command abort
> when the set address times out.  I'd really rather think about the
> implications and get the solution right the first time than have a quick
> hack in the driver to work around a buggy device.

Got it. I will do it and send them next week.

>
>> > Feel free to send your patch before you get your special device.  I have
>> > an old VIA USB 3.0 hub I can test the set address timeout on.  It stops
>> > responding to set addresses after several disconnect/connect cycles.
>> >
>>
>> Hmm. As I mentioned above, those patched was just used to handle address
>> device command timeout and tested by our customer. If you wanna check it
>> out, I need to verify them carefully first to prevent them generating other
>> problems. Maybe I need to perfect it for handling all cases.
>
> IMO, it's better to send a partially completed patch and get the
> conversation rolling than to get it "perfect" and then have to redo it
> when some points out some basic flaw.  That's the open source way of
> doing patches.  We get something mostly working, send it to the
> community for feedback, integrate the feedback, and resubmit the patch
> series.

Yeah, sorry, I miss the point.

>
>> >> > Maybe you need to ask the firmware team if there's anything special that
>> >> > needs to be done to initialize the integrated USB 2.0 hub after a resume
>> >> > from suspend?  Maybe it needs a Get Descriptor control transfer before
>> >> > the Set Address control transfer?  Or it could be something completely
>> >> > different.  At this point, I'm not really sure how to fix the issue with
>> >> > the USB 2.0 hub.
>> >>
>> >> Maybe I need to determine the port link state of the port attached by the
>> >> integrated usb 2.0 hub. I worry that the port link state is transition to other
>> >> states except for U0 after the integrated 2.0 hub resume from suspend.
>> >
>> > But shouldn't the xHCI host controller report that port state?  Or do
>> > you think it's a host bug as well?
>>
>> Yes. I think it is host issue. Actually, I have found some strange response
>> from integrated hub. I need to do more tries.
>
> Ok, let me know what you find.  I'm inclined to hold off on the reset
> resume quirk until we know what's actually going on.  I can't send
> patches to Greg until next week anyway.
>
>> >> I agree with your opinions. So we have to add a XHCI_RESET_ON_RESUME
>> >> quirk for VIA host before the suspend operation is finished.
>> >
>> > I'm a bit confused.  Are you saying that the USB 2.0 hub comes back fine
>> > (answers to a set address) if we ignore the register restore step for S3
>> > resume and just halt and reset the host controller?  Meaning that the
>> > VIA host needs your original XHCI_RESET_ON_RESUME quirk patch on top of
>> > patch 2?
>>
>> Yes. If driver re-initialize the host in suspend operation, the usb
>> 2.0 hub comes
>> back fine and all devices attached to it works fine.
>>
>> >
>> > If so, can you refine your patch to only set the reset resume for the
>> > specific revisions of VIA hosts that have this issue?  I'd like to give
>> > VIA the chance to fix this issue in their future chip revisions.
>>
>> OK, I will modify it and send it again. In additional, do you have any
>> comment about the patch that print the VIA host firmware version?
>> if you think maybe it is useful sometimes, perhaps I can add the
>> XHCI_RESET_ON_RESUME quirk in it.
>
> I think it's useful, but I would rather the firmware printing be a
> completely separate patch on top of the quirk.  Then we can send the
> reset resume quirk patch off to the stable tree, and have the firmware
> printing patch go into the 3.5 queue.

I see. I will send the v2 patch just has the XHCI_RESET_ON_RESUME
quirk next Monday. Thanks.

Best Regards,
Elric Fu

>
> Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux