On Fri, 8 Mar 2013, Sarah Sharp wrote: > On Fri, Mar 08, 2013 at 11:14:38AM -0500, Alan Stern wrote: > > On Thu, 7 Mar 2013, Sarah Sharp wrote: > > > > > "One Ring to rule them all, One Ring to find them, > > > One Ring to bring them all and in the darkness bind them" > > > - J. R. R. Tolkien > > > > > > There is only one command ring for each xHCI host, and all commands flow > > > through that ring. However, when a tricksy little USB device fails to > > > respond to the Set Address command, the entire ring hangs. > > > > > > If a Stop Endpoint command had been queued at the same time, the > > > watchdog timer will fire shortly after we start aborting the Set Address > > > command. The watchdog timer will assume the host is dying and kill it, > > > disconnecting all USB devices under the host. It's basically a USB DoS > > > attack. > > > > > > The fix is for all commands to check whether they are the currently > > > executing command on the command ring when they timeout. If not, they > > > will wait on the command completion again, and the Stop Endpoint > > > watchdog timer will re-queue itself. If they are the currently > > > executing command, they will cancel themselves, or the Stop Endpoint > > > watchdog timer will assume the host is dying. > > > > You might want to go farther than this. I haven't read the driver code > > enough to be sure how you have this organized, but it seems logical > > that you shouldn't set a command's expiration time until that command > > begins executing. > > > > Suppose the ring contains commands A and B and you set B's expiration > > time while A is still running. Then A could continue running until B's > > time is almost up. When B does start, the watchdog timer will expire > > before B has a chance to do anything. > > Yes, it's true. With this patch, the watchdog timer will re-queue > itself, of course. No; it will see that B is the currently executing command and either cancel B or conclude that the host is dead. Just like the behavior without this patch, except that the window of opportunity is smaller. > I do agree that it's inefficient to have multiple > timers running when we could only have one timer running. > > > By not setting a command's expiration time until that command begins > > running, you remove the need for the check added by this patch. It > > will not be possible for the command's watchdog timer to expire before > > the command becomes the one currently executing. > > In general, all the commands wait (interruptibly) on a completion, with > a 5 second timeout. However, the Stop Endpoint command doesn't have a > completion, since it's part of the asynchronous call to cancel an URB. > The Stop Endpoint command is the only one with a watchdog timer. > > I started out with an idea to yours, of having a global command list. > Commands would be queued to the FIFO, submitted to the command ring, and > then their submitters would call wait_for_completion_interruptible > (without a timeout), or simply return, in the case of the Stop Endpoint > command. The watchdog timer routine would be gutted to be a global > command queue manager that timed the currently executing command, and > would cancel it if necessary. > > In the end, I decided it would be better for the stable kernels to keep > to the simpler (if less efficient) solution. But I would like to do the > "right thing" later on. Would it be simpler to keep the existing implementation, but start the timer for a Stop Endpoint command in the completion handler for the previous command? (But then what happens if there are two Stop Endpoint commands in a row?) Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html