Re: [PATCH 2/3 v2] xhci: Handle canceled URBs when HC dies.

Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> · Thu, 1 Oct 2009 16:30:41 -0700

On Tue, Sep 29, 2009 at 06:00:48PM -0400, Alan Stern wrote:
> On Tue, 29 Sep 2009, Sarah Sharp wrote:
> 
> > So yes, this is a problem.  I'll have to think more about how to
> > integrate a watchdog timer in the code for the case where the hardware
> > is not responding to a stop endpoint command.
> > 
> > > > If the timer expires the hcd would run through all queues and
> > > > complete (with an error) all urbs.
> > 
> > Should it cancel all the URBs?  The hardware could be completing URBs
> > for other endpoints in the system, since each endpoint has its own
> > transfer ring.  If the hardware isn't able to stop one endpoint in the
> > system but other endpoints are working, should we assume the HC is just
> > broken, halt and reset it, and complete all URBs with an error?
> 
> This is more or less up to you.  When the watchdog timer expires, you
> have to stop the hardware from accessing the URBs you're about to give
> back.  If you can do this without disturbing other URBs, then fine.  
> If not, every URB you are forced to abort should complete with a
> -ESHUTDOWN status.

The xHCI watchdog timer will run in interrupt context.  It looks like
the EHCI driver doesn't really care that the halting of the host can
take up to 2ms (16 microframes) in interrupt context.  I was going to be
nice and spawn a work queue to halt, reset, and giveback the URBs if the
stop endpoint command timed out.  Should I bother?  Or should I just do
all that work in the timer interrupt?

> > > David is right.  This isn't the way dead host controllers are handled 
> > > in the other HCDs.
> > > 
> > > The drivers test at some appropriate point for whether the controller
> > > is still connected and alive (generally from within the IRQ handler,
> > > the resume routine, and/or a watchdog timer routine).  If it isn't, the
> > > driver sets hcd->state to HC_STATE_HALT (to tell usbcore that the
> > > controller is dead), possibly calls usb_hc_died(), resets the
> > > controller hardware, and sets a private flag.
> > 
> > The documentation for usb_hc_died() says it's only for host controllers
> > attached to non-PCI buses, so I don't think that applies to xHCI.  I'll
> > rework this patch to set hcd->state to HC_STATE_HALT.

On second thought, why does the documentation say usb_hc_died() is only
necessary for non-PCI buses?  Is that because usb_hc_died() is called
from the PCI interrupt handler when HC_STATE_HALT is set?  If I set that
state in the watchdog timer function or spawned work queue, will the USB
core ever notice if that state is set if there are no more PCI
interrupts?  Or will I have to call usb_hc_died() from the xHCI driver?

> > I wish there was some documentation on when a host controller is
> > supposed to set hcd->state.  I mostly don't touch it at all, so are
> > there any other conditions where I should change it?
> 
> hcd->state is a big mess.  :-(  It isn't documented properly, it gets 
> set both by the core and by the HCD under various circumstances, and 
> there aren't any clear guidelines on how to use it.
> 
> Basically it should work like this:  Whenever you reset or stop the
> controller, set the state to HC_STATE_HALT.  Whenever you start the
> root hub, set the state to HC_STATE_RUNNING.  usbcore will set the
> state to HC_QUIESCING, HC_SUSPENDED, and HC_RESUMING at the appropriate
> times.

Ok, it looks like I've been using HC_STATE_RUNNING correctly.  I set it
before I enable IRQs for the host controller, since the xHC can send a
port status change event as soon as interrupts are enabled.

> It's important that the state _not_ be set to HC_STATE_HALT at times 
> when the controller might interrupt the CPU.

Gotcha.  So it's probably safe to set HC_STATE_HALT when the interrupt
handler or urb_dequeue function detects the PCI card has been removed.
If it comes back, it will be initialized as a different bus, correct?

> > > Routines involved in unlinking and giving back URBs check the private 
> > > flag.  If it is set then they bypass the hardware and carry out the 
> > > software part of their job immediately.
> > 
> > The xHCI routines to give back URBs are only called when the hardware
> > interrupts with an event on the event ring.  xHCI doesn't periodically
> > scan the frame list like EHCI.  If the timer runs, it's going to have to
> > give back the URBs itself.
> 
> The timer routine can call the same subroutines as the interrupt
> handler, to avoid duplicating code.

It's not really possible to share much code there.  The routines may go
through different TD lists, and it looked rather ugly and even more
confusing when I tried to share that code.

> Alan Stern
> 
> PS: xHCI must handle split transactions somehow, since there's no other
> way to communicate with a USB-1.1 device behind a USB-2.0 hub.  Does
> the xHCI spec describe how the hardware schedules split transactions?  
> Does it obey all the scheduling rules in the USB-2.0 spec?

Yes, the xHCI hardware generates all the split transactions itself.  The
xHCI The xHCI spec refers to section 8.4.2 and 11.14 of the USB2 spec,
but doesn't place any requirements on how hardware is supposed to
generate split transactions.  There is a "split transaction error" that
the xHCI driver is supposed to handle like a stalled endpoint, but I
haven't put that in yet, and haven't seen that error on any hardware I'm
currently testing on.

Sarah
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html