Re: [RFT 2/2] USB: Free bandwidth when usb_disable_device is called.

Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> · Mon, 6 Jun 2011 12:31:01 -0700

On Mon, Jun 06, 2011 at 01:49:36PM -0400, Alan Stern wrote:
> On Mon, 6 Jun 2011, Sarah Sharp wrote:
> 
> > On Mon, Jun 06, 2011 at 11:10:23AM -0400, Alan Stern wrote:
> > > How about just unconditionally calling usb_hcd_alloc_bandwidth() before 
> > > looping through the endpoint arrays?
> > 
> > I don't think that's a good idea, because there could be URBs pending
> > before the loop calling usb_disable_endpoint() completes.  The bandwidth
> > allocation functions assume that all URB have been canceled or completed
> > before an endpoint is dropped.  Otherwise the command to remove the
> > endpoint rings can race with transfer completion.
> 
> And xhci-hcd isn't prepared to handle those races ... ?

Nope.  The driver has been told to issue a request to the hardware to
remove an endpoint and its associated ring.  What is the driver supposed
to do if URBs are pending on that ring?  It should be up to the USB core
to cancel those.

I have always assumed that when a driver requests a device reset, a
change of configuration or alternate interface setting, that it has
already either canceled active URBs or completed all the URBs for the
endpoints that are going away.  Otherwise the device could see a packet
bound for an endpoint that isn't part of the current interface.  If
those race conditions do exist, then we need to revisit the
implementation, but it seems like only a broken driver would leave URBs
pending.

> > > Alternatively, how about making 
> > > xhci-hcd's endpoint_disable method responsible for releasing the 
> > > endpoint's bandwidth reservation?
> > 
> > The problem is that the USB core calls the endpoint_disable method too
> > often, for all endpoints, without checking whether the endpoints were
> > actually active.
> 
> It does?  The only place I see it getting called is from within
> usb_disable_endpoint() when the endpoint array entry exists and
> reset_hardware is True.

I think the issue was that usb_disable_endpoint() and
usb_enable_endpoint() are only called after the control transfer to the
device to change the configuration or alternate interface setting
succeeds.  The bandwidth checking needs to be done before the USB core
issues that control transfer.

I also I think I didn't want to use those functions because I wanted to
batch all the changes to the endpoints associated with an alternate
interface setting change or a configuration change into one command to
issue to the xHCI host.

If I used those functions, the xHCI driver would have to issue a
Configure Endpoint command for each changed endpoint.  Then we have the
race condition where we could drop one endpoint for device 1, another
endpoint for a device 2 could be added, and then the add of a new
endpoint for device 1 would fail.  But we might not be able to re-add
the dropped endpoint for device 1 because suddenly we're out of bus
bandwidth.  At that point, the xHCI host thinks device 1 has no
endpoints active at all, and there's a mismatch between what the USB
core thinks the active endpoints are.

The code to drop old endpoints and add new ones really needs to be done
atomically in the functions that attempt to set up the new configuration
or alternate interface settings.  The code in those functions might look
ad-hock, but it's because I'm trying to satisfy the atomicity
requirements and the requirement to change the xHCI bandwidth structures
before the control transfer is issued.

> >  The xHCI driver doesn't keep track of which endpoints
> > are added; it expects the USB core to do so.  If the USB core
> > unconditionally drops all endpoints, the command to drop them will be
> > rejected by the host controller.
> > 
> > Besides, I'm not too keen on re-writing the whole bandwidth checking
> > code just for this special case. :)  Any other thoughts?  Is there
> > anything technically wrong with the patch, or it just looks ugly?
> 
> Well, it does look ugly.  At least you could change it to look like 
> this:
> 
>  	dev_dbg(&dev->dev, "%s nuking %s URBs\n", __func__,
>  		skip_ep0 ? "non-ep0" : "all");
> +
> +	if (hcd->driver->check_bandwidth) {
> +		/* First pass: Cancel URBs, leave endpoint pointers intact. */
> +		for (i = skip_ep0; i < 16; ++i) {
> +			usb_disable_endpoint(dev, i, false);
> +			usb_disable_endpoint(dev, i + USB_DIR_IN, false);
> +		}
> +		/* Remove endpoints from the host controller internal state */
> +		usb_hcd_alloc_bandwidth(dev, NULL, NULL, NULL);
> +	}
> +
>  	for (i = skip_ep0; i < 16; ++i) {
>  		usb_disable_endpoint(dev, i, true);
>  		usb_disable_endpoint(dev, i + USB_DIR_IN, true);

Yes, that does look a little better.

> You could even remove the "if" wrapper on the theory that always doing
> everything twice won't hurt.

I was just trying to minimize the changes that EHCI/UHCI/OHCI would see,
but if you think calling usb_disable_endpoint twice is harmless, I can
remove the if wrapper.

> Another alternative is to change usb_hcd_alloc_bandwidth().  It starts
> with two copies of the code that drops the endpoints in the current
> config, using the ep_in and ep_out arrays.  You could replace that with
> a single copy that loops through the interface and endpoint descriptors
> in the current config instead.  Then it wouldn't matter if the arrays
> are zeroed before the bandwidth computation.

I think there was some issue that forced me to use the endpoint arrays.
Maybe it was something like the data structures that are accessible from
usb_hcd_alloc_bandwidth() don't allow me to access the
usb_host_endpoint?  I do remember struggling with that code.

> In the end, it would be really nice to make all this stuff totally
> rational.  Right now it's kind of an ad-hoc mess.  We should go through
> a unified procedure whenever updating an altsetting or config:
> 
> 	Disable and flush the current endpoints;
> 
> 	Remove the endpoints' hardware resources (including bandwidth);
> 
> 	Allocate hardware resources for the new endpoints;
> 
> 	Enable the new endpoints.

Mmm, but the removing of old endpoint bandwidth resources has to happen
atomically with adding new endpoint bandwidth resources.

> We should also try to come up with a good strategy for handling errors. 
> It's too bad that xHCI doesn't provide any way to check a new bandwidth 
> setting without disabling the current endpoints.

Yeah, the driver could just attempt to add the new endpoints, but if any
of the endpoint addresses overlap, the host controller is going to
reject them, as Tanya has found out.  So we really need to drop the old
endpoints before adding new ones.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html