On Wed, 22 Jul 2009, David Brownell wrote: > On Tuesday 21 July 2009, Alan Stern wrote: > > Dave: > > > > It seems there are some errors in the way ehci-hcd handles > > QH_STATE_COMPLETING. They are largely due to the fact that COMPLETING > > isn't a real state -- URBs can be given back either while a QH is > > linked or while it is unlinked (or even while it is in transition). > > It's more of a transient lock, making it so only qh_completions() > may modify the QH even if ehci->lock was dropped ... because it's > got to drop that lock while giving URBs back through usbcore. And yet if the completion routine submits or unlinks more URBs, we _will_ have to modify the QH. There's no need to worry about the QH getting unlinked while an URB is being given back. A comment in scan_async() notes the possibility and the code is prepared for it. The real problem (or at least, part of the problem) is that the QH's true state is obscured. If other code sees QH_STATE_COMPLETING then it can't tell whether the QH is really linked, unlinked, or in between. > I do have a note in my patches to revisit the case of unlink_async() > seeing a QH in that state, which is one transition you ask about. Right. Since unlink_async() doesn't know whether the QH is truly linked or not, it doesn't know what to do. > > For example, suppose the completion routine for a bulk or control URB > > unlinks a later URB queued for the same endpoint. Then > > ehci_urb_dequeue will see that qh->qh_state is QH_STATE_COMPLETING, so > > it will call unlink_async. But when unlink_async sees that the state > > isn't QH_STATE_LINKED, it will return almost immediately without doing > > anything. Thus qh never does get unlinked. > > Maybe it does -- if the URB completed naturally, or the QH is either > unlinked or halted *and* qh_completions() hasn't already scanned that > particular URB. But maybe it doesn't. > > What's needed is a way to *force* the qh_completions() logic to either > (a) start a QH unlink, if it would restore to QH_STATE_LINKED and the > QH wasn't halted, else (b) do an extra scan, in case it already looked > at that URB on that scan. "(b) do an extra scan" sounds good. These cases I'm concerned about are a little exceptional, so the overhead won't matter. > > Finally, qh_completions can be called from several different pathways. > > It wouldn't be surprising to see it get called recursively on occasion. > > I'd be surprised. :) > > If that happens, it's a bug. ehci_urb_dequeue() will call qh_completions() when an interrupt URB is unlinked. While that URB is being given back, an IRQ might occur and scan_periodic() might then call qh_completions() for the same endpoint. > > And since it doesn't test for QH_STATE_COMPLETING at the start, it > > would run recursively, thereby messing up the list_for_each_safe() loop > > through the qTDs. > > ... and potentially trashing the QH context, which can be a bit > fragile. If you're concerned about that, just check and warn. > > > > Maybe instead of using QH_STATE_COMPLETING, a QH should contain a flag > > to indicate that its queue is being scanned and another flag to > > indicate that a rescan will be needed. What do you think? > > That could work. Certainly the "full scan needed" flag is needed > to address the control/bulk issue. And it might help with cleaner > code to unlink interrupt transfers. I'd leave the pseudo-state in > place for now though, According to grep, the only place we look for COMPLETING is in the control/bulk case of ehci_urb_dequeue(). That needs to be changed anyway. Of course, there may be other, implicit uses -- i.e., where the code checks for other values and the comparisons fail because the value is COMPLETING. There are a few places where that can happen, such as in unlink_async() and ehci_endpoint_reset(). As far as I can tell, they are all buggy. So I think it would be better to do away with QH_STATE_COMPLETING entirely. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html