Re: [PATCH 0/4 v5] xHCI: PM patchset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 28, 2010 at 10:39:44AM -0400, Alan Stern wrote:
> On Tue, 27 Jul 2010, Sarah Sharp wrote:
> 
> > On Tue, Jul 27, 2010 at 11:14:39AM -0400, Alan Stern wrote:
> > > On Tue, 27 Jul 2010, Xu, Andiry wrote:
> > > 
> > > > > That's not quite right.  It tries to reset the device first.  Only if
> > > > > the reset fails does it free and re-initialize the device.
> > > > > 
> > > > 
> > > > Yes, you are right, I didn't make it clearly. The hub driver will try to
> > > > reset the device first. After several failures, it will assume the
> > > > device is gone, then free and re-initialize the device.
> > > > 
> > > > The hub driver will call hcd->driver->reset_device() to reset the
> > > > device. Reset Device is a xHCI command to put a device in addressed or
> > > > configured state into default state. However since the xHC is reset, the
> > > > reset command will fail - all the slots are freed and disabled after xHC
> > > > reset. The command failure will show "cannot reset HCD device state"
> > > > messages.
> > > 
> > > Why isn't the device already in the addressed state?  Doesn't the xHC 
> > > controller detect the device and initialize it right after the xHC 
> > > reset completes?  Is the problem just that there wasn't enough time?
> > 
> > After the host controller is reset after the host resume fails, all the
> > state it has about devices is gone.  The xHCI host controller needs to
> > be told to re-address each device (basically to go through the
> > enumeration sequence again so it can get a slot ID and address the
> > device).
> > 
> > If xhci_reset_device() is called for a device that the xHC host doesn't
> > know about, the call will fail because the usb_device's slot_id field is
> > set to zero.  Now, xhci_reset_device() could be changed to return
> > success if the slot_id is set to zero.  I'm not sure if that would be
> > good for other error cases.
> 
> It looks like the problem is the special-case test in hub_port_init().  
> It skips the reset if the device is new and is running at SuperSpeed.  
> Clearly this is not the right criterion.

What I wanted to do was skip the port reset when a USB 3.0 device is
first plugged into the system.  When the device undergoes link training,
it has already gone through a warm port reset, so there's no need to
reset the device twice.  Now, that's just an optimization, and it
doesn't really matter to me if USB 3.0 devices get reset during an
initial connection.

I guess I don't understand what the issue is with the special case in
hub_port_init().  I thought the real issue was that the xHCI host
controller is has been reset and has lost its memory about any devices
that the USB core has allocated.

> I think the test should be moved down into xhci-hcd, which knows more
> about the true situation, and it should be fixed.  What really matters 
> is whether the hardware already knows about the device, not whether the 
> device is new.  (In fact, it is possible to have new devices that the 
> hardware _does_ know about, and it is possible to have old devices that 
> the hardware _doesn't_ know about.)  What should happen, if the device 
> isn't already known, is that the hardware should be told to discover 
> it.
> 
> And finally, what about USB-2.0 devices?  Do they have the restrictions
> regarding resets?  I imagine not, since they already work under xHCI
> with the current code.

USB 2.0 devices have a similar issue.  If the xHCI hardware doesn't know
about them (i.e. the usb_device structure is not allocated), then
xhci_reset_device() will return an error status.

The initialization sequence from the xHCI driver's perspective is like
this:  When usb_alloc_dev() is called and the device is under xHCI, then
xhci_alloc_dev() is called, and the device is issued a slot ID by the
xHC host.  When hub_set_address() is called, xhci_address_device() is
called.

xhci_address_device() sets up the device slot with a default control
endpoint ring and some information about device speed and parent hub is
stored in the device context.  Then it issues a command to the xHCI host
controller to set the address for the device and add the control
endpoint to the host controller's schedule.  Until this command
completes, a call to xhci_reset_device() will fail.  This initialization
happens for all speeds of devices.

So I think the issue that Andiry is running into is that the xHCI host
controller is reset, and it loses all the device context state.  (I'm
not sure if it's leaking memory at that point, Andiry may want to turn
on kmemleak to check.)  The USB core still thinks the device exists,
since the usb_device structure is still present, and so it tries to call
xhci_reset_device() after the failed resume.

> > > > After check, I think I may found the difference between EHCI and xHCI in
> > > > device address choosing. In EHCI, the hub driver calls choose_address()
> > > > to set the device address. Choose_address() uses find_next_zero_bit() to
> > > > find an available address. So the devnum will increase every time,
> > > > because bus->devnum_next is increased every time. However, xHCI does not
> > > > call choose_address(). The USB device address is assigned by H/W, upon a
> > > > successful address device command. xHCI driver then set the devnum to
> > > > the value xHC returns, so it will always be the same value.
> > > 
> > > I thought that might be the problem.  The kernel can't handle addresses 
> > > that are chosen by the hardware; it wants to choose the address itself.  
> > > xhci-hcd should set the address to the value specified by the kernel 
> > > instead of telling the kernel what address the hardware used.
> > 
> > So, you're saying that the xHCI driver should have an internal field for
> > the real device address, and allow the kernel to think it assigned a
> > different address?
> 
> No.  I'm saying that xhci-hcd's address_device callback should actually
> change the device's address to the value assigned by the kernel.
> 
> Hmm, the Wireless USB core also seems to have its own ideas about
> addressing devices.  It wouldn't be surprising if they end up causing
> trouble too.

I don't have a way to ask the xHCI host controller to issue a particular
address to a device.  The problem is that the hardware supports
per-device virtualization, and guests must not be allowed to pick an
address that conflicts with another guest.  So the hardware picks the
address during the initial call into the xHCI driver's
xhci_address_device() in hub_set_address().

What's worse is that the hardware checks the control endpoint for a
SetAddress command, and doesn't allow software to manually set the
address for the device.  See section 4.6.5:

"If the xHC detects a SET_ADDRESS request on the Default Control
Endpoint Transfer Ring, it shall generate a TRB Error Completion Status
for the TD. The xHC shall never forward a SET_ADDRESS request on a
Default Control Endpoint Transfer Ring to a USB device."

But I think there's a deeper issue with the failed host controller
resume.  I think we're seeing this char file conflict messages because
there is a difference in what the USB core and the xHCI host think the
device state is.

I think what needs to happen in that case is to tell the USB core that
all devices under the host have disconnected, and then force them to be
re-enumerated.  Or even possibly re-initialize the host controller,
since the event ring pointers and such may be lost.  Andiry, what do you
think?

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux