On Sun, 22 Apr 2012, Ming Lei wrote:

> On Sun, Apr 22, 2012 at 8:50 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > On Sun, 22 Apr 2012, Ming Lei wrote:
> >
> >> > Although the kerneldoc doesn't actually say so, it should be safe to
> >> > assume that usb_unlink_urb calls the completion routine directly _only_
> >> > in cases where the unlink succeeded.  (We could add this to the
> >> > kerneldoc.)
> >> >
> >> > Therefore: If the URB completes with status other than -ECONNRESET then
> >> > you can safely take the lock for resubmission.  If the URB completes
> >> > with status == -ECONNRESET then you know it was unlinked, so you don't
> >> > need to take the lock -- the race has already been lost.
> >> >
> >> > Does that solve your problem?
> >>
> >> Not sure if that does work.
> >>
> >> If the URB completes asynchronously after unlinking, its status is still
> >> -ECONNRESET, so extra race may be caused without holding the lock
> >> because complete handler will access some global data.
> >
> > That would be a completely separate race, right?  So maybe it can use a
>
> Not sure, at least in both usbnet and usbhid cases, the lock is held before
> usb_unlink_urb, and the same lock is to be acquired in the URB complete
> handler.
>
> > different lock for protection -- and this other lock could be dropped
> > before usb_unlink_urb is called.
>
> If the lock which is to be acquired in the URB complete handler is dropped
> before calling usb_unlink_urb, one new submitted URB in complete handler
> may be unlinked, as mentioned by Oliver already.

We are now talking about two locks.  One of them is held during the
call to usb_unlink_urb; the completion handler does not acquire that
lock if the URB's status is -ECONNRESET.  The other lock is dropped
before usb_unlink_urb is called, so the completion handler can safely
grab it.
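The two-lock arrangement described above might be modelled roughly as
follows.  This is only a userspace sketch, not driver code: struct
tl_dev, tl_lock/tl_unlock, tl_complete and tl_unlink_path are all
hypothetical names, the "locks" are plain flags whose asserts catch a
recursive acquisition, and the direct call to tl_complete merely
simulates usb_unlink_urb invoking the completion routine
synchronously.  It shows only which lock is taken where; it does not
address the resubmission race raised elsewhere in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical driver state; not taken from usbnet or usbhid. */
struct tl_dev {
    bool unlink_lock;   /* lock held across the unlink call */
    bool data_lock;     /* protects shared data; dropped before the unlink */
    int resubmits;      /* counts simulated resubmissions */
    int data_updates;   /* counts accesses to the shared data */
};

/* Flag-based lock model: the asserts fire on recursive locking. */
static void tl_lock(bool *l)   { assert(!*l); *l = true; }
static void tl_unlock(bool *l) { assert(*l);  *l = false; }

/* Model of the completion handler. */
static void tl_complete(struct tl_dev *d, int status)
{
    /* The data lock is never held across the unlink, so it is safe here. */
    tl_lock(&d->data_lock);
    d->data_updates++;
    tl_unlock(&d->data_lock);

    /* Unlinked URB: skip the unlink lock and do not resubmit. */
    if (status == -ECONNRESET)
        return;

    tl_lock(&d->unlink_lock);
    d->resubmits++;             /* stand-in for resubmitting the URB */
    tl_unlock(&d->unlink_lock);
}

/* Model of the path that unlinks the URB. */
static void tl_unlink_path(struct tl_dev *d)
{
    tl_lock(&d->data_lock);
    /* ... examine shared data ... */
    tl_unlock(&d->data_lock);   /* dropped before the unlink */

    tl_lock(&d->unlink_lock);
    /* Simulates usb_unlink_urb calling the completion routine directly. */
    tl_complete(d, -ECONNRESET);
    tl_unlock(&d->unlink_lock);
}
```

Calling tl_unlink_path() on a zeroed struct tl_dev runs through the
simulated synchronous completion without tripping either assert,
because the handler never touches the lock that is held at that point.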
On Mon, 23 Apr 2012, Oliver Neukum wrote:

> > If the URB completes asynchronously after unlinking, its status is still
> > -ECONNRESET, so extra race may be caused without holding the lock
> > because complete handler will access some global data.
>
> That is the race. And you need not invoke global data. The original
> race opens again if you are submitting a new URB without the lock
> held.
> This is because we cannot be sure that the same URB is unlinked
> only once. A subsequent timeout may kill the wrong URB if the
> first is unlinked so that the callback really comes in interrupt.
>
> But the basic idea is brilliant. It's just that the one way logical implication:
> recursive direct call of the callback -> status == -ECONNRESET
> is not strong enough. But that is very easy to fix. As we know whether
> the callback is directly called or not, all we need to do is differentiate
> the cases in urb->status, by introducing a new error code.

I don't like the idea of changing the status codes.  It would mean
changing usb_kill_urb too.

Instead of changing return codes or adding locks, how about
implementing a small state machine for each URB?

Initially the state is ACTIVE.

When the URB times out, acquire the lock.  If the state is not equal
to ACTIVE, drop the lock and return immediately (the URB is being
unlinked concurrently).  Otherwise set the state to UNLINK_STARTED,
drop the lock, call usb_unlink_urb, and reacquire the lock.  If the
state hasn't changed, set it back to ACTIVE.  But if the state has
changed to UNLINK_FINISHED, set it to ACTIVE and resubmit.

In the completion handler, grab the lock.  If the state is ACTIVE,
resubmit.  But if the state is UNLINK_STARTED, change it to
UNLINK_FINISHED and don't resubmit.

This is a better approach, in that it doesn't make any assumptions
regarding synchronous vs. asynchronous unlinks.  If you want, you
could have two different ACTIVE substates, one for URBs which haven't
yet been unlinked and one for URBs which have been.
Then you could avoid unlinking the same URB twice.

Alan Stern
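For illustration, the per-URB state machine proposed in this message
could be sketched in userspace C along the following lines.  None of
this is kernel code: struct sm_urb, the sm_* helpers and the enum
names are assumptions made for the example, the lock is a flag with
asserts standing in for a real spinlock, and the sync_unlink field
only simulates whether the unlink calls the completion handler
directly or leaves it to arrive later.

```c
#include <assert.h>
#include <stdbool.h>

/* The three states named in the message; the enum itself is an assumption. */
enum sm_state { SM_ACTIVE, SM_UNLINK_STARTED, SM_UNLINK_FINISHED };

/* Hypothetical per-URB bookkeeping; not a real kernel structure. */
struct sm_urb {
    enum sm_state state;
    bool locked;        /* models the lock; asserts catch recursive use */
    bool sync_unlink;   /* simulate a direct completion call from the unlink */
    int submits;        /* counts simulated resubmissions */
};

static void sm_lock(struct sm_urb *u)   { assert(!u->locked); u->locked = true; }
static void sm_unlock(struct sm_urb *u) { assert(u->locked);  u->locked = false; }

/* Stand-in for resubmitting the URB. */
static void sm_submit(struct sm_urb *u) { u->submits++; }

/* Completion handler: resubmit only if no unlink is in progress. */
static void sm_complete(struct sm_urb *u)
{
    sm_lock(u);
    if (u->state == SM_ACTIVE)
        sm_submit(u);
    else if (u->state == SM_UNLINK_STARTED)
        u->state = SM_UNLINK_FINISHED;  /* the timeout path will resubmit */
    sm_unlock(u);
}

/* Stand-in for usb_unlink_urb: may or may not complete synchronously. */
static void sm_unlink(struct sm_urb *u)
{
    if (u->sync_unlink)
        sm_complete(u);
}

/* Timeout path. */
static void sm_timeout(struct sm_urb *u)
{
    bool resubmit;

    sm_lock(u);
    if (u->state != SM_ACTIVE) {        /* being unlinked concurrently */
        sm_unlock(u);
        return;
    }
    u->state = SM_UNLINK_STARTED;
    sm_unlock(u);                       /* lock is dropped across the unlink */

    sm_unlink(u);

    sm_lock(u);
    resubmit = (u->state == SM_UNLINK_FINISHED);
    u->state = SM_ACTIVE;               /* back to ACTIVE in either case */
    if (resubmit)
        sm_submit(u);
    sm_unlock(u);
}
```

With sync_unlink set, sm_timeout() sees UNLINK_FINISHED after the
unlink returns and resubmits itself.  With sync_unlink clear, it
restores ACTIVE without resubmitting, and a later call to
sm_complete() performs the resubmission instead.  In both cases the
handler takes the lock unconditionally, since the timeout path never
holds it across the unlink.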