On Mon, Jun 06, 2011 at 01:49:36PM -0400, Alan Stern wrote: > On Mon, 6 Jun 2011, Sarah Sharp wrote: > > > On Mon, Jun 06, 2011 at 11:10:23AM -0400, Alan Stern wrote: > > > How about just unconditionally calling usb_hcd_alloc_bandwidth() before > > > looping through the endpoint arrays? > > > > I don't think that's a good idea, because there could be URBs pending > > before the loop calling usb_disable_endpoint() completes. The bandwidth > > allocation functions assume that all URB have been canceled or completed > > before an endpoint is dropped. Otherwise the command to remove the > > endpoint rings can race with transfer completion. > > And xhci-hcd isn't prepared to handle those races ... ? Nope. The driver has been told to issue a request to the hardware to remove an endpoint and its associated ring. What is the driver supposed to do if URBs are pending on that ring? It should be up to the USB core to cancel those. I have always assumed that when a driver requests a device reset, a change of configuration or alternate interface setting, that it has already either canceled active URBs or completed all the URBs for the endpoints that are going away. Otherwise the device could see a packet bound for an endpoint that isn't part of the current interface. If those race conditions do exist, then we need to revisit the implementation, but it seems like only a broken driver would leave URBs pending. > > > Alternatively, how about making > > > xhci-hcd's endpoint_disable method responsible for releasing the > > > endpoint's bandwidth reservation? > > > > The problem is that the USB core calls the endpoint_disable method too > > often, for all endpoints, without checking whether the endpoints were > > actually active. > > It does? The only place I see it getting called is from within > usb_disable_endpoint() when the endpoint array entry exists and > reset_hardware is True. I think the issue was that usb_disable_endpoint() and usb_enable_endpoint() are only called after the control transfer to the device to change the configuration or alternate interface setting succeeds. The bandwidth checking needs to be done before the USB core issues that control transfer. I also I think I didn't want to use those functions because I wanted to batch all the changes to the endpoints associated with an alternate interface setting change or a configuration change into one command to issue to the xHCI host. If I used those functions, the xHCI driver would have to issue a Configure Endpoint command for each changed endpoint. Then we have the race condition where we could drop one endpoint for device 1, another endpoint for a device 2 could be added, and then the add of a new endpoint for device 1 would fail. But we might not be able to re-add the dropped endpoint for device 1 because suddenly we're out of bus bandwidth. At that point, the xHCI host thinks device 1 has no endpoints active at all, and there's a mismatch between what the USB core thinks the active endpoints are. The code to drop old endpoints and add new ones really needs to be done atomically in the functions that attempt to set up the new configuration or alternate interface settings. The code in those functions might look ad-hock, but it's because I'm trying to satisfy the atomicity requirements and the requirement to change the xHCI bandwidth structures before the control transfer is issued. > > The xHCI driver doesn't keep track of which endpoints > > are added; it expects the USB core to do so. If the USB core > > unconditionally drops all endpoints, the command to drop them will be > > rejected by the host controller. > > > > Besides, I'm not too keen on re-writing the whole bandwidth checking > > code just for this special case. :) Any other thoughts? Is there > > anything technically wrong with the patch, or it just looks ugly? > > Well, it does look ugly. At least you could change it to look like > this: > > dev_dbg(&dev->dev, "%s nuking %s URBs\n", __func__, > skip_ep0 ? "non-ep0" : "all"); > + > + if (hcd->driver->check_bandwidth) { > + /* First pass: Cancel URBs, leave endpoint pointers intact. */ > + for (i = skip_ep0; i < 16; ++i) { > + usb_disable_endpoint(dev, i, false); > + usb_disable_endpoint(dev, i + USB_DIR_IN, false); > + } > + /* Remove endpoints from the host controller internal state */ > + usb_hcd_alloc_bandwidth(dev, NULL, NULL, NULL); > + } > + > for (i = skip_ep0; i < 16; ++i) { > usb_disable_endpoint(dev, i, true); > usb_disable_endpoint(dev, i + USB_DIR_IN, true); Yes, that does look a little better. > You could even remove the "if" wrapper on the theory that always doing > everything twice won't hurt. I was just trying to minimize the changes that EHCI/UHCI/OHCI would see, but if you think calling usb_disable_endpoint twice is harmless, I can remove the if wrapper. > Another alternative is to change usb_hcd_alloc_bandwidth(). It starts > with two copies of the code that drops the endpoints in the current > config, using the ep_in and ep_out arrays. You could replace that with > a single copy that loops through the interface and endpoint descriptors > in the current config instead. Then it wouldn't matter if the arrays > are zeroed before the bandwidth computation. I think there was some issue that forced me to use the endpoint arrays. Maybe it was something like the data structures that are accessible from usb_hcd_alloc_bandwidth() don't allow me to access the usb_host_endpoint? I do remember struggling with that code. > In the end, it would be really nice to make all this stuff totally > rational. Right now it's kind of an ad-hoc mess. We should go through > a unified procedure whenever updating an altsetting or config: > > Disable and flush the current endpoints; > > Remove the endpoints' hardware resources (including bandwidth); > > Allocate hardware resources for the new endpoints; > > Enable the new endpoints. Mmm, but the removing of old endpoint bandwidth resources has to happen atomically with adding new endpoint bandwidth resources. > We should also try to come up with a good strategy for handling errors. > It's too bad that xHCI doesn't provide any way to check a new bandwidth > setting without disabling the current endpoints. Yeah, the driver could just attempt to add the new endpoints, but if any of the endpoint addresses overlap, the host controller is going to reject them, as Tanya has found out. So we really need to drop the old endpoints before adding new ones. Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html