Hi Sarah On Thu, Sep 13, 2012 at 2:37 AM, Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> wrote: > On Wed, Sep 12, 2012 at 03:36:39PM +0530, Ajay Gupta wrote: >> Hi, >> >> I am using v3.5 kernel and running a test where I disconnect a SS MSC >> device while a big file is being written to it. I am also watching the >> lsusb output in parallel. Expectation is that SS MSC device should >> immediately disappear from lsusb output but sometime I see that SS MSC >> is listed for 30 seconds and then only disappears. Log is copied >> below. >> >> Looking further into the issue shows that in failing case an URB is >> not given back by XHCI driver and so SCSI layer unlinks it after >> 30second of timeout. This URB was actually given to XHCI host >> controller and a disconnect happens while packet transfer was in >> progress. I looked at XHCI driver and could not find request being >> aborted by driver (in xhci_free_dev() fn) before issuing disable slot >> or reset device command. What is expectation in this scenario? > > The expectation is that the mass storage driver should have aborted any > URBs in its disconnect function before xhci_free_dev() is called. I > believe all USB drivers are required to do this. If that assumption > isn't true, then, yes, I need to revisit any place that xHCI rings get > freed. I think your assumption is correct and same has been discussed by Alan at http://marc.info/?l=linux-usb&m=134382746009339&w=2could The 30 seconds delay seems to be expected linux storage behavior as per discussion in above thread. I tested again with little modified version of code (copied below) suggested in above link and don't see any issue. I understand that below code is a pure hack and *not* a solution. ----------------------------------------------------------------------------------------- --- a/usb.c +++ b/usb.c @@ -834,6 +834,8 @@ static void quiesce_and_remove_host(struct us_data *us) if (test_bit(US_FLIDX_SCAN_PENDING, &us->dflags)) usb_autopm_put_interface_no_suspend(us->pusb_intf); + /* stop the current urbs when the device got disconnected */ + usb_stor_stop_transport(us); /* Removing the host will perform an orderly shutdown: caches * synchronized, disks spun down, etc. */ ----------------------------------------------------------------------------------------- Alan suggested to add logic inside host controller driver to unlink all URBs at disconnect. This is required as either SCSI or storage layer doesn't know if device is unplugged or unbounded. Another solution could be to add some kind of flag notifying to SCSI layer of device disconnect or unbound and so SCSI layer shouldn't wait for 30 seconds if it's disconnect. >> Is it >> XHCI driver responsibility to abort and giveback all the request (as >> in EHCI using endpoint_disable()) >> Why XHCI >> driver doesn't have an endpoint_disable() implemented as done for >> other controller? > > The xHCI driver just works differently than EHCI. It's not a > requirement to implement the endpoint disable function, AFAIK. From > what I remember, the endpoint disable function was supposed to be used > before switching alternate interface settings, but it was called too > late to be of much use to the xHCI bandwidth functions. So I created > new bandwidth functions and ignored the endpoint disable functions. > Maybe Alan can explain what the requirements about the endpoint disable > function are? endpoint_disbale is also called from usbcore layer in disconnect path. > >> OR host controller hardware should >> abort and post an error completion event for each request? > > If the host controller hardware was in the middle of a transfer when the > device was disconnected, and that caused the transfer to timeout, then, > yes, the host should return an event with a transfer error and halt the > event ring, as described in the xHCI 1.0 spec, section 4.10.2.3. Even > if the host wasn't strictly speaking in the middle of a transfer during > the disconnect, the endpoint ring should remain on the host's schedule > as long as the xHCI driver didn't issue a stop endpoint command. I > would expect that the transfer would timeout the next time it was > executed from the schedule. > > Basically, the host controller has to play dumb and hand back events > with error statuses for all TDs that continue to be schedulable after a > device disconnect. Eventually the USB core will notice the disconnect, > notify the driver, and it will cancel any outstanding URBs. > >> =============== Working case ==================== [...] >> dma), new deq ptr = f37ad070 (0x337ad070 dma), new cycle = 0 > > According to that, the rings are still in tact and we can issue a stop > endpoint ring command successfully. Are you sure xhci_free_dev() was > called before the URB was canceled? Have you tried adding some printk > warnings in xhci_free_dev() if the td_list for any endpoint is not > empty? I tested again and found that xhci_free_dev() in is indeed called after URB is cancelled post 30 second timeout and so td_list is empty at the time of issuing disable_slot cmd. > > We can certainly change the driver to stop the endpoints and give back > any URBs in xhci_free_dev(), but I want to make sure we understand > what's actually going on in the scenario you describe. We may want to change driver to avoid 30 seconds timeout as Alan suggested in another thread. Ajay > > Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html