Re: usbdevfs: BULK_CONTINUATION flag does not work with XHCI controller

Hi,

On 06/04/2012 06:13 AM, Sarah Sharp wrote:
On Sun, Jun 03, 2012 at 03:58:57PM +0200, Hans de Goede wrote:
Hi Sarah,

Ten days ago I sent the mail below to various lists; in the meantime
Alan Stern has responded and he agrees it is likely an xhci driver
(or hardware) issue. Since this breaks redirecting USB-2 mass-storage
devices to a kvm vm when plugged into a USB-3 port, this is a rather
important bug for us to get fixed.

Comments below.  I don't agree that this is an xHCI host controller
issue.

Thanks for the detailed answer.

Can you take a look at this please? As usual I'm happy to test any
patches / run further tests.

Regards,

Hans

p.s.

Another bug report just arrived on the libusb mailing list, which
likely has the same underlying cause:
http://libusb.org/ticket/133


###

Hi,

I spent the last 1.5 days debugging a problem with redirecting
certain USB mass-storage devices to a Windows 7 vm.

This problem only happens when the device is plugged into a
USB-3 port.

I believe I now know what is happening:

First the working sequence:
1) Windows guest writes some bulk data to the msd
2) Windows guest tries to read up to 20480 bytes, only reads 36 bytes
3) Windows guest tries to read up to 512 bytes, only reads 13 bytes

The non working sequence:
1) Windows guest writes some bulk data to the msd
2) Windows guest tries to read up to 20480 bytes, only reads 49 bytes
3) Windows guest tries to read up to 512 bytes ->  timeout

Here is what I believe is happening: at step 2 libusb splits the
bulk IN transfer into 2 packets, 1 of 16k and 1 of 4k, setting the
USBDEVFS_URB_SHORT_NOT_OK flag on both and the
USBDEVFS_URB_BULK_CONTINUATION flag on the second.
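The splitting described above can be sketched in C. This is a minimal, self-contained illustration, not libusb's actual code; the flag and type values are redefined locally (matching `<linux/usbdevice_fs.h>`), and `struct urb_sketch` is a simplified stand-in for `struct usbdevfs_urb`, keeping only the fields the splitting logic touches:

```c
#include <stddef.h>
#include <string.h>

/* Values match <linux/usbdevice_fs.h>; redefined here so the
 * sketch is self-contained. */
#define USBDEVFS_URB_SHORT_NOT_OK       0x01
#define USBDEVFS_URB_BULK_CONTINUATION  0x04
#define USBDEVFS_URB_TYPE_BULK          3

#define SPLIT_LIMIT (16 * 1024)   /* per-URB split size used by libusb */
#define MAX_URBS    8

/* Simplified stand-in for struct usbdevfs_urb. */
struct urb_sketch {
    unsigned char type;
    unsigned char endpoint;
    unsigned int  flags;
    void         *buffer;
    int           buffer_length;
};

/* Split one bulk IN transfer of 'len' bytes into <=16k URBs,
 * setting SHORT_NOT_OK on all of them and BULK_CONTINUATION on
 * all but the first, as described above. Returns the URB count. */
static int split_bulk_in(struct urb_sketch *urbs, unsigned char ep,
                         char *buf, int len)
{
    int n = 0;

    for (int off = 0; off < len && n < MAX_URBS; off += SPLIT_LIMIT, n++) {
        int chunk = len - off < SPLIT_LIMIT ? len - off : SPLIT_LIMIT;

        memset(&urbs[n], 0, sizeof(urbs[n]));
        urbs[n].type          = USBDEVFS_URB_TYPE_BULK;
        urbs[n].endpoint      = ep;      /* IN endpoint, e.g. 0x81 */
        urbs[n].buffer        = buf + off;
        urbs[n].buffer_length = chunk;
        urbs[n].flags         = USBDEVFS_URB_SHORT_NOT_OK;
        if (n > 0)   /* continuation of the same logical transfer */
            urbs[n].flags |= USBDEVFS_URB_BULK_CONTINUATION;
    }
    return n;
}
```

In the real code each resulting URB would then be handed to the kernel with ioctl(fd, USBDEVFS_SUBMITURB, &urb), all queued up front rather than one at a time.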

With the EHCI controller all then works as intended, the first packet
transfers 36 bytes and returns a status of EREMOTEIO, the second
packet gets cancelled by drivers/usb/core/devio.c, transferring 0 bytes
and returns a status of ECONNRESET and we all live happily ever after :)

With the XHCI controller however, the first packet transfers 36 bytes
and returns a status of EREMOTEIO, as it should, but the second
packet transfers the 13 next bytes and returns a status of ECONNRESET.

So you're fine with the second packet getting the ECONNRESET status
returned, but you don't want the 13 bytes transferred.  You want zero
bytes transferred.

Correct.


IOW the second packet of the split bulk transfer reads more data
despite the first packet returning an error status and the
USBDEVFS_URB_BULK_CONTINUATION flag being set.

Why doesn't the upper layer (devio.c) just submit the second packet only
after the first one completes without a short packet?

For performance reasons: if an app is doing lots of small bulk transfers
(which happens more often than you would wish, I'm afraid), then waiting for
one to complete before submitting the next one results in lots of USB bus
idle time.

I'm confused
about the flow of buffers and URBs between userspace and kernel.  Please
provide a more detailed example.

1) A userspace libusb app submits a bulk IN transfer which wants to read > 16k
2) libusb splits this into multiple urbs, setting the USBDEVFS_URB_SHORT_NOT_OK
   flag on all urbs and the USBDEVFS_URB_BULK_CONTINUATION flag on all but the
   first.
3) devio.c processes these, submitting them all to the controller and remembering
   the USBDEVFS_URB_BULK_CONTINUATION flag per urb
4) The device returns a short packet (less than requested) on the first urb
5) According to Alan Stern, with an EHCI controller, the controller halts the ep
   because of the USBDEVFS_URB_SHORT_NOT_OK flag + 4).
6) devio.c walks its list of pending urbs, cancelling all urbs for the same device
   + ep which have the USBDEVFS_URB_BULK_CONTINUATION flag set (iow stopping at the
   first urb for the same ep without the USBDEVFS_URB_BULK_CONTINUATION flag set).

If I understood Alan correctly he believes the XHCI controller not doing 5 is the
problem.
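The cancellation walk in step 6 can be sketched as follows. This is a simplified illustration of the logic, not the actual devio.c code (the real implementation walks `struct async` entries on the usbfs pending list and calls usb_unlink_urb); `struct pending` and `cancel_continuations` are names invented for this sketch:

```c
#include <stdbool.h>

#define USBDEVFS_URB_BULK_CONTINUATION 0x04

/* Simplified pending-URB record; 'list' stands for the URBs queued
 * after the one that just failed with a short read. */
struct pending {
    unsigned char endpoint;
    unsigned int  flags;
    bool          cancelled;
};

/* After a SHORT_NOT_OK urb on 'ep' fails, cancel every later pending
 * urb on the same endpoint that carries BULK_CONTINUATION, stopping
 * at the first one that does not (the start of an unrelated
 * transfer). Returns how many were cancelled. */
static int cancel_continuations(struct pending *list, int n,
                                unsigned char ep)
{
    int cancelled = 0;

    for (int i = 0; i < n; i++) {
        if (list[i].endpoint != ep)
            continue;                   /* other endpoint: skip */
        if (!(list[i].flags & USBDEVFS_URB_BULK_CONTINUATION))
            break;                      /* new transfer: stop here */
        list[i].cancelled = true;       /* real code unlinks the urb */
        cancelled++;
    }
    return cancelled;
}
```

The walk only works as intended if the endpoint really is halted when the walk runs; on xHCI the ring may already be executing the next queued urb, which is exactly the race described below.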

Looking at this again I think I've found a (related) bug in devio.c: if another
(not a continued, but a completely different) bulk transfer has already been
submitted to the ep, who / what is going to restart the ep after the urbs which
belong to the short transfer have been cancelled?

Thus the second packet has consumed the data the device had ready,
which should have been read by the next bulk transfer from the guest's
pov.

All in all it seems that the cancel of further packets done by
drivers/usb/core/devio.c: cancel_bulk_urbs()
comes too late when the device is on an XHCI controller, it seems that
the controller is already "executing" the next bulk transfer *before*
the completion handler of the previous one has completed.

Yes, I believe that's true.  If the packet is already queued on the
endpoint ring, the xHCI driver is going to start transferring that
packet as soon as the previous transfer completes.

Right, so if I understood correctly, devio assumes the EP ring will stop
automatically on a short transfer when the urb has the
USBDEVFS_URB_SHORT_NOT_OK flag set.

In fact, it may take
some time to stop the endpoint rings for a completion, so you're
probably seeing that one part of the SCSI command (possibly the setup
phase?) finished transferring before the xHCI driver could cancel the
command.

Right, that is exactly what seems to be happening.

It's really odd that this behavior worked under EHCI.  AFAICT, the
USBDEVFS_URB_BULK_CONTINUATION flag isn't even exposed to the host
controller (only to usbfs), so I don't see how canceling the bulk
command under EHCI could even work.

devio assumes that the ep stops on the first short read if
USBDEVFS_URB_SHORT_NOT_OK is set, and for EHCI this seems to be the
case.

Unless usbfs is submitting one bulk transfer with multiple URBs in a
scatter gather list?  And then it's internally tracking the
USBDEVFS_URB_BULK_CONTINUATION flag, and attempting to cancel the rest
of the pending URBs in the completion handler when it gets a short
packet status?  Since the USB core is internally waiting on each URB to
complete in a scatter gather list before submitting the next one, that's
the only way I could see that this behavior could work.

No, the splitting into multiple urbs is happening in userspace
(it is done by libusb), not by devio.c

If that's what usbfs is internally doing, I think that behavior only
works because you have a kernel that doesn't include scatter-gather
support in EHCI.  Once you get scatter gather support in the EHCI
driver, you will see the same behavior as xHCI.  I.e. all the URBs will
be queued to the EHCI host controller at once, and usbfs won't be able
to cancel the second packet before the transfer starts.

How in the world is the USBDEVFS_URB_BULK_CONTINUATION flag supposed to
work if only usbfs knows about it?

Because devio assumes that the ep gets halted on errors, and the
USBDEVFS_URB_BULK_CONTINUATION flag only comes into action if a previous
packet of the same multi-urb transfer fails in some way (in my specific
example a short read is an error because of the USBDEVFS_URB_SHORT_NOT_OK
flag).

And why can't userspace wait on the
previous transfers to complete before submitting other URBs to usbfs
with that flag set?

Performance reasons: having multiple urbs queued up significantly enhances
performance. If userspace were to submit the urbs one by one, performance
would become quite bad, because there is no guarantee the process will
wake up immediately when a previous urb completes.

To dig further into the heart of the issue, why isn't libusb submitting
large transfers in an iovec to a modified usbfs?  That's the interface
usbfs2 was supposed to add.

I'm all for doing a usbfs2, esp. since we need some major changes to add
support for bulk streams anyway. But that is a completely different
discussion IMHO.

Or why can't users just migrate to new
kernels that have the total usbfs upper limit, and not the per packet
limit?

Because professional users often need support, which means using one of
the enterprise distros (RHEL-6 in this case), which don't have a new
enough kernel.

Also, bulk transfers can be up to 4 MB at a minimum (I could not find
a hard upper limit for them), and at some point it starts making sense to
split them, i.e. maybe have part of the urbs for a 4 MB transfer queued up
and, as some complete, add more. Then, as the first transfer nears completion,
start queuing up some urbs for the next 4 MB transfer, etc. When doing
things like that from multiple apps at the same time you will soon
run into limitations even with the new kernel limits. So at some point
it will always start making sense to split the transfers, and we need
splitting up transfers to work.

Moreover, the USBDEVFS_URB_BULK_CONTINUATION flag is part of our official
API and it should just work as advertised!


Which leads to the conclusion that I believe that the
BULK_CONTINUATION flag does not work properly with the XHCI controller.

I don't pay attention to that flag *at all* in the xHCI driver.  It's
*not exposed* to the host controller driver:

sarah@xanatos:~/git/kernels/xhci$ git grep USBDEVFS_URB_BULK_CONTINUATION
drivers/usb/core/devio.c:                               USBDEVFS_URB_BULK_CONTINUATION |
drivers/usb/core/devio.c:               if (uurb->flags & USBDEVFS_URB_BULK_CONTINUATION)
include/linux/usbdevice_fs.h:#define USBDEVFS_URB_BULK_CONTINUATION     0x04

It's the behavior of the upper layers that you need to change.  You
can't change anything in the xHCI driver to make the BULK_CONTINUATION
flag work.

See above: if Alan is right that the ep ring should stop on an error,
and a short read is an error when the USBDEVFS_URB_SHORT_NOT_OK flag is set,
then this is possibly an xHCI driver issue.

To verify my theory, I've tried raising the packet splitting limit in
libusb to 32k, and that (unsurprisingly) fixes things, but that is just
a band-aid and not a proper fix for the issue at hand IMHO. Also,
bulk packets > 32k only work with the latest kernels.

Why not just use a later kernel?  That is the simpler solution.  As far
as I can tell, that would mean that some devices that need libusb are
just broken, but the majority of in-kernel drivers work fine.  I'm ok
with telling users to upgrade their kernel and use the higher packet
splitting limit in order to get their devices to work.

1) Raising the limit to 32k will fix the issue I've been investigating,
but not http://libusb.org/ticket/133, which is likely the same issue;
fixing that one would require raising the split limit to 128k, which brings
us to the question: how far should we raise the split limit?

2) The USBDEVFS_URB_BULK_CONTINUATION flag is part of our official
API and this should just work as advertised!

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

