On Thu, Oct 27, 2011 at 01:28:31PM +0200, Sarah Sharp wrote:
> On Wed, Oct 26, 2011 at 12:05:00AM +1100, Matt wrote:
> > On 25/10/2011 10:34 PM, Sarah Sharp wrote:
> > >Does that show the failure?  I don't see anything wrong in that
> > >particular chunk of dmesg, just the USB device responding with a short
> > >transfer, followed by the USB 2.0 bus getting suspended.  I don't see
> > >any of the failures in your previous log file like
> > >
> > >[ 1154.636273] xhci_hcd 0000:03:00.0: ERROR: unexpected command completion code 0x13.
> > >
> > >You might also want to apply the attached patch that removes debugging
> > >about scatter gather lists (since it's really not useful to me and it's
> > >filling up the log).
> > >
> > >Sarah Sharp
> >
> > I believe it did show the error however I've applied the patch and
> > can provide further detail now.
> >
> > Firstly, this dmesg output should show from boot to when I connect
> > the external drive chassis: http://pastebin.com/raw.php?i=EHd2t3ji
> >
> > Then, just to be sure that I don't scroll away any useful data, I've
> > taken another capture of dmesg after I've attempted to run 'parted'
> > and apply a GPT partition to one of the disks:
> > http://pastebin.com/raw.php?i=jBTkVZrq
> >
> > During this process, parted complains about the disk being inaccessible.
>
> The first bit is that your disk is spitting back READ errors for a lot
> of sectors, so either your disk is going bad, or the USB to SATA
> connector doesn't like the commands userspace is sending it.  The
> usb-storage driver eventually attempts to reset the device because of
> the errors.
>
> Before the device is reset, the USB core tries to drop the endpoints
> associated with that config.  But the host controller's state for those
> endpoints says that they are disabled.  We end up not dropping those
> endpoints, issuing a command to the host to basically do nothing, which
> the host doesn't like.
> But we'll leak memory if we don't drop those
> endpoints, so the behavior we really want is to drop the endpoints.
>
> So either the host controller is not tracking the endpoint state
> properly (and we should just ignore the disabled state and try to remove
> those endpoints before a reset), or something more fundamental is broken
> somewhere in between the first log file and the second log file.
>
> I'll have to send you a further patch to debug when exactly the
> endpoints get put into the disabled state.  Bug me if you haven't seen
> that patch by next Wednesday.

Ok, the xHCI driver code already has a warning message about disabled
endpoints that's spit out when a driver attempts to submit an URB to a
disabled endpoint:

static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
		u32 ep_state, unsigned int num_trbs, gfp_t mem_flags)
{
	/* Make sure the endpoint has been added to xHC schedule */
	switch (ep_state) {
	case EP_STATE_DISABLED:
		/*
		 * USB core changed config/interfaces without notifying us,
		 * or hardware is reporting the wrong state.
		 */
		xhci_warn(xhci, "WARN urb submitted to disabled ep\n");
		return -ENOENT;

You might need to search further back in your logs, between the time the
device enumerated and when you started getting READ errors, to see if
that message is in your log.  The xHCI driver will refuse to queue the
URB if the endpoint is disabled, so it's entirely possible that the
driver itself is causing the READ errors.  Driver debug output from
before that message appeared will be useful.

Sarah Sharp
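
P.S. A minimal sketch of the log search suggested above, as plain userspace C. It assumes the dmesg capture has already been read into memory and that lines carry the usual "[ seconds]" timestamp prefix. first_disabled_ep_warning() is a hypothetical helper written for illustration, not part of the xHCI driver; the only thing taken from the driver is the message text in the xhci_warn() call quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Message text copied from the xhci_warn() call in prepare_ring(). */
#define DISABLED_EP_MSG "WARN urb submitted to disabled ep"

/*
 * Hypothetical helper: scan a dmesg capture held in memory for the first
 * line containing DISABLED_EP_MSG.
 *
 * Returns 1 if the message is found, 0 otherwise. On a match, *ts_out is
 * set to the line's timestamp in seconds, or to -1.0 if the line does not
 * start with a "[ seconds]" prefix.
 */
static int first_disabled_ep_warning(const char *log, double *ts_out)
{
	const char *hit, *line;
	double ts;

	assert(log != NULL && ts_out != NULL);

	hit = strstr(log, DISABLED_EP_MSG);
	if (!hit)
		return 0;

	/* Walk back to the start of the line that contains the match. */
	line = hit;
	while (line > log && line[-1] != '\n')
		line--;

	/* "%lf" skips the padding spaces dmesg puts after the '['. */
	if (sscanf(line, "[%lf", &ts) == 1)
		*ts_out = ts;
	else
		*ts_out = -1.0;

	return 1;
}
```

If this returns 1, the timestamp marks where to start reading backwards: the driver debug just before it should show what put the endpoint into the disabled state. A plain grep for the same string works just as well; the helper only adds the timestamp extraction.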