Re: [RFC 8/9] UAS: Cancel pending URBs and free streams in pre-reset.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2 Dec 2011, Sarah Sharp wrote:

> > Bear in mind that pre_reset isn't necessarily trying to flush any 
> > commands.  It just gets called from time to time.  Maybe because the 
> > device is stuck and needs to be reset; maybe because some other driver 
> > for a different interface on the same device needs to reset it.
> >
> > You do need to decide what to do about commands that are in flight.  
> > Ideally you would wait until they all complete by themselves (or are
> > aborted by the error handler, which amounts to the same thing).  
> > Actively cancelling their URBs in preparation for the reset is not as
> > safe, although it is necessary when things time out or aborts fail.
> 
> It's true that another interface driver could want to reset the USB
> device, not just the UAS driver.  But is there a way to tell the
> difference without adding code to the UAS driver?

There _is_ no real difference.  For example, it's entirely possible
that another interface's driver could decide to reset the device at the
same time as the UAS driver.

On the other hand, it's not a big deal to add a single flag meaning "I 
have decided to reset the device".

> If the UAS driver is trying to reset the device, then it probably makes
> sense to flush/kill everything rather than simply wait for it to
> complete.  At that point, the device will have already refused to
> respond to an abort task IU for a queued SCSI command.  The devices I've
> seen that need a reset just seem wedged, and no other commands complete
> on other stream IDs.
> 
> I'm also not sure we can just wait for the commands to complete.  Since
> the device chooses which streams to request data for (and thus which
> SCSI commands to service), it's entirely possible the device could just
> starve a particular command indefinitely due to some fireware bug.  So
> another driver asks for a reset, we let the commands complete on their
> own (while not submitting new commands), the starved SCSI command
> eventually times out and the abort task fails, and then the UAS driver
> asks for a reset.  What happens then?

In that case the other driver would have to wait for the timeout.  You
could improve the situation a little when this occurs by giving the
existing commands only a short time in which to complete, after which
you try to abort them.  The abort would also need to have a short
timeout.

> > > The solution is to introduce a new atomic_t, resetting.  When the
> > 
> > Why atomic_t?  Is the flag going to be written from multiple threads or
> > contexts?
> 
> I think I used atomic_t because it has the memory barrier semantics I
> need to make SRCU work.  I'll have to dig up my notes on that.

I don't think atomic_t values have any memory barrier semantics in 
particular.  Some of the functions that operate on atomic_t's do have 
memory barriers, though.

Don't the SRCU primitives already provide all the memory barriers you 
need?  They are supposed to.

> > > resetting flag is set, uas_queuecommand_lck() should tell the SCSI layer
> > > that the device is busy for any new commands, uas_stat_cmplt() will tell
> > > the SCSI layer the command aborted without submitting the data URBs, and
> > 
> > This is confusing.  What you want to do is plug your queue.  That is, 
> > stop sending out new commands, while waiting for the existing ones to 
> > complete.  Is that what you meant?  (I'm not sure exactly what 
> > uas_stat_cmplt's job is.)
> 
> With USB 2.0 devices, we submit one pair of URBs (the command phase and
> status phase), and then submit a second pair of URBs (the data phase and
> status phase) in the first status phase's URB completion handler
> (uas_stat_cmplt).  If the device is about to be reset because the UAS
> driver requested it, I'm not sure it makes sense to submit the second
> round of URBs.  I suppose we could let them go through if a different
> driver was requesting the reset, but we'd need a flag or something
> similar to signal the UAS driver was the requester.

I guess this depends on your philosophy: When a reset is coming up, do
you want to allow as many outstanding commands as possible to complete
normally?  Or do you prefer to abort them all immediately?

If you do decide to cancel them, then yes, uas_stat_cmplt requires some
sort of synchronization.  But it has to be synchronized with the
cancellation operation, not with the reset.  For example, what would
happen if the driver were asked to cancel a command just as
uas_stat_cmplt was called?

> Can usb_kill_anchored_urb() fail?

No more than any unlink operation can.  That's basically all it does.

>  If the ABORT TASK IU fails to get the
> command out of the device command queue, we simply kill the URBs associated
> with that command.  If usb_kill_anchored_urb() can't fail, we'll be
> fine.  (I think it could only fail if the xHCI host controller failed to
> complete the stop endpoint command, at which point the host controller
> is just hosed and we have bigger issues.)

Right.

> > > We use SRCU to synchronize between the readers of the resetting flag
> > > (uas_queuecommand_lck, uas_xfer_data, and uas_eh_abort_handler) and the
> > > writers of the flag (uas_pre_reset and uas_post_reset).  We have to use
> > > "sleepable" RCU instead of RCU because the readers must be able to sleep
> > > while waiting for URBs to be killed, while holding the reader lock.
> > 
> > It's not clear that this is the best approach.  You already own the
> > host lock in uas_queuecommand_lck; why not use that for
> > synchronization?
> 
> The host lock isn't taken in the UAS workqueue (which attempts to
> resubmit URBs for SCSI commands that initially failed due to out of
> memory or errors from usb_submit_urb).

Yeah, okay, that sounds like it may be a reasonable use for SRCU.

What happens when you decide to cancel a command that has some URBs 
sitting on the workqueue's queue?  How do you prevent the workqueue 
from submitting them?

>  It also isn't taken in the
> status phase handler (uas_stat_cmplt) that resubmits the second pair of
> URBs for USB 2.0 UAS devices.  I'm not sure if the SCSI host lock can be
> taken in interrupt context?

It can.  However, from the comments above, it's not clear that 
uas_stat_cmplt needs to do anything special.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux