Re: usbdevfs: BULK_CONTINUATION flag does not work with XHCI controller

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jun 03, 2012 at 03:58:57PM +0200, Hans de Goede wrote:
> Hi Sarah,
> 
> 10 days ago I send the below mail to various lists, in the mean time
> Alan Stern has responded and he agrees it is likely an xhci driver
> (or hardware) issue. Since this breaks redirecting USB-2 mass-storage
> devices to a kvm vm when plugged into an USB-3 port, this is a rather
> important bug for us to get fixed.

Comments below.  I don't agree that this is an xHCI host controller
issue.

> Can you take a look at this please? As usual I'm happy to test any
> patches / run further tests.
> 
> Regards,
> 
> Hans
> 
> p.s.
> 
> An other bug report just arrived at the libusb mailinglist, which
> likely has the same underlying cause:
> http://libusb.org/ticket/133
> 
> 
> ###
> 
> Hi,
> 
> I spend the last 1.5 days debugging a problem with redirecting
> certain USB mass-storage devices to a Windows 7 vm.
> 
> This problem only happens when the device is plugged into an
> USB 3 port.
> 
> I believe I now know what is happening:
> 
> First the working sequence:
> 1) Windows guest writes some bulk data to the msd
> 2) Windows guest tries to read up to 20480 bytes, only reads 36 bytes
> 3) Windows guest tries to read up to 512 bytes, only reads 13 bytes
> 
> The non working sequence:
> 1) Windows guest writes some bulk data to the msd
> 2) Windows guest tries to read up to 20480 bytes, only reads 49 bytes
> 3) Windows guest tries to read up to 512 bytes -> timeout
> 
> Here is what I believe is happening, at 2 libusb splits the
> bulk in transfer into 2 packets, 1 of 16k and 1 of 4k, setting the
> USBDEVFS_URB_SHORT_NOT_OK flag on both and the
> USBDEVFS_URB_BULK_CONTINUATION flag on the second.
> 
> With the EHCI controller all then works as intended, the first packet
> transfers 36 bytes and returns a status of EREMOTEIO, the second
> packet gets cancelled by drivers/usb/core/devio.c, transferring 0 bytes
> and returns a status of ECONNRESET and we all live happily ever after :)
> 
> With the XHCI controller however, the first packet transfers 36 bytes
> and returns a status of EREMOTEIO, as it should, but the second
> packet transfers the 13 next bytes and returns a status of ECONNRESET.

So you're fine with the second packet getting the ECONNRESET status
returned, but you don't want the 13 bytes transferred.  You want zero
bytes transferred.

> IOW the second packet of the split bulk transfer reads more data
> despite the first packet returning an error status and the
> USBDEVFS_URB_BULK_CONTINUATION flag being set.

Why doesn't the upper layer (devio.c) just submit the second packet only
after the first one completes without a short packet?  I'm confused
about the flow of buffers and URBs between userspace and kernel.  Please
provide a more detailed example.

> Thus the second packets has consumed the data the device had ready
> which should have been read by the next bulk transfer from the guest
> pov.
> 
> All in all it seems that the cancel of further packets done by
> drivers/usb/core/devio.c: cancel_bulk_urbs()
> comes too late when the device is on an XHCI controller, it seems that
> the controller is already "executing" the next bulk transfer *before*
> the completion handler of the previous one has completed.

Yes, I believe that's true.  If the packet is already queued on the
endpoint ring, the xHCI driver is going to start transferring that
packet as soon as the previous transfer completes.  In fact, it may take
some time to stop the endpoint rings for a completion, so you're
probably seeing that one part of the SCSI command (possibly the setup
phase?) finished transferring before the xHCI driver could cancel the
command.

It's really odd that this behavior worked under EHCI.  AFAICT, the
USBDEVFS_URB_BULK_CONTINUATION flag isn't even exposed to the host
controller (only to usbfs), so I don't see how canceling the bulk
command under EHCI could even work.

Unless usbfs is submitting one bulk transfer with multiple URBs in a
scatter gather list?  And then it's internally tracking the
USBDEVFS_URB_BULK_CONTINUATION flag, and attempting to cancel the rest
of the pending URBs in the completion handler when it gets a short
packet status?  Since the USB core is internally waiting on each URB to
complete in a scatter gather list before submitting the next one, that's
the only way I could see that this behavior could work.

If that's that usbfs is internally doing, I think that behavior only
works because you have a kernel that doesn't include scatter-gather
support in EHCI.  Once you get scatter gather support in the EHCI
driver, you will see the same behavior as xHCI.  I.e. all the URBs will
be queued to the EHCI host controller at once, and usbfs won't be able
to cancel the second packet before the transfer starts.

How in the world is the USBDEVFS_URB_BULK_CONTINUATION flag supposed to
work if only usbfs knows about it?  And why can't userspace wait on the
previous transfers to complete before submitting other URBs to usbfs
with that flag set?

To dig further into the heart of the issue, why isn't libusb submitting
large transfers in an iovec to a modified usbfs?  That's the interface
usbfs2 was supposed to add.  Or why can't users just migrate to new
kernels that have the total usbfs upper limit, and not the per packet
limit?

> Which leads to the conclusion that I believe that the
> BULK_CONTINUATION flag does not work properly with the XHCI controller.

I don't pay attention to that flag *at all* in the xHCI driver.  It's
*not exposed* to the host controller driver:

sarah@xanatos:~/git/kernels/xhci$ git grep USBDEVFS_URB_BULK_CONTINUATION
drivers/usb/core/devio.c:                               USBDEVFS_URB_BULK_CONTINUATION |
drivers/usb/core/devio.c:               if (uurb->flags & USBDEVFS_URB_BULK_CONTINUATION)
include/linux/usbdevice_fs.h:#define USBDEVFS_URB_BULK_CONTINUATION     0x04

It's the behavior of the upper layers that you need to change.  You
can't change anything in the xHCI driver to make the BULK_CONTINUATION
flag work.

> To verify my theory, I've tried raising the packet splitting limit in
> libusb to 32k and that (unsurprisingly) fixes things, but that is just
> a bandaid and not a proper fix for the issue at hand IMHO. Also
> bulk packets > 32k only work with the latest kernels.

Why not just use a later kernel?  That is the simpler solution.  As far
as I can tell, that would mean that some devices that need libusb are
just broken, but the majority of in-kernel drivers work fine.  I'm ok
with telling users to upgrade their kernel and use the higher packet
splitting limit in order to get their devices to work.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux