Re: usbdevfs: BULK_CONTINUATION flag does not work with XHCI controller

Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> · Mon, 4 Jun 2012 10:15:55 -0400 (EDT)

On Mon, 4 Jun 2012, Hans de Goede wrote:

> >> Hi,
> >>
> >> I spent the last 1.5 days debugging a problem with redirecting
> >> certain USB mass-storage devices to a Windows 7 vm.
> >>
> >> This problem only happens when the device is plugged into a
> >> USB 3 port.
> >>
> >> I believe I now know what is happening:
> >>
> >> First the working sequence:
> >> 1) Windows guest writes some bulk data to the msd
> >> 2) Windows guest tries to read up to 20480 bytes, only reads 36 bytes
> >> 3) Windows guest tries to read up to 512 bytes, only reads 13 bytes
> >>
> >> The non working sequence:
> >> 1) Windows guest writes some bulk data to the msd
> >> 2) Windows guest tries to read up to 20480 bytes, only reads 49 bytes
> >> 3) Windows guest tries to read up to 512 bytes ->  timeout
> >>
> >> Here is what I believe is happening, at 2 libusb splits the
> >> bulk in transfer into 2 packets, 1 of 16k and 1 of 4k, setting the
> >> USBDEVFS_URB_SHORT_NOT_OK flag on both and the
> >> USBDEVFS_URB_BULK_CONTINUATION flag on the second.
> >>
> >> With the EHCI controller all then works as intended, the first packet
> >> transfers 36 bytes and returns a status of EREMOTEIO, the second
> >> packet gets cancelled by drivers/usb/core/devio.c, transferring 0 bytes
> >> and returns a status of ECONNRESET and we all live happily ever after :)
> >>
> >> With the XHCI controller however, the first packet transfers 36 bytes
> >> and returns a status of EREMOTEIO, as it should, but the second
> >> packet transfers the 13 next bytes and returns a status of ECONNRESET.
> >
> > So you're fine with the second packet getting the ECONNRESET status
> > returned, but you don't want the 13 bytes transferred.  You want zero
> > bytes transferred.
> 
> Correct.

Not exactly.  The 13 bytes should indeed be transferred; the problem is
that they are getting transferred to the wrong URB.
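
To make "the wrong URB" concrete, here is a rough user-space model of the
queue.  It is only a sketch, not kernel code: struct sim_urb and sim_run()
are names made up for this illustration, and each device response is
assumed to be a single short packet.  With halt_on_short set, the queue
stops after the short URB and the continuation URB is unlinked with zero
bytes; with it clear, the continuation URB is already active and takes the
next packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a queued bulk-IN URB; not a kernel structure. */
struct sim_urb {
    size_t requested;   /* buffer length asked for */
    int short_not_ok;   /* models USBDEVFS_URB_SHORT_NOT_OK */
    int continuation;   /* models USBDEVFS_URB_BULK_CONTINUATION */
    size_t actual;      /* bytes that ended up in this URB */
};

/*
 * Hand the device's responses to the queued URBs in order.
 * halt_on_short != 0 models an HCD that stops the endpoint queue when a
 * SHORT_NOT_OK URB ends short, so the continuation URBs can be unlinked
 * before any more data moves.
 */
static void sim_run(struct sim_urb *q, size_t nurbs,
                    const size_t *resp, size_t nresp, int halt_on_short)
{
    size_t u = 0;

    assert(q != NULL && resp != NULL);
    for (size_t i = 0; i < nurbs; i++)
        q[i].actual = 0;

    for (size_t r = 0; r < nresp && u < nurbs; r++) {
        size_t got = resp[r] < q[u].requested ? resp[r] : q[u].requested;
        int short_err = q[u].short_not_ok && got < q[u].requested;

        q[u].actual = got;
        u++;

        if (short_err && halt_on_short) {
            /* Queue is stopped: continuations are unlinked with 0 bytes. */
            while (u < nurbs && q[u].continuation)
                u++;
        }
        /* Without the halt, URB u is already active and gets resp[r+1]. */
    }
}
```

Queue three URBs of 16384 (SHORT_NOT_OK), 4096 (SHORT_NOT_OK plus
continuation) and 512 bytes, and feed in responses of 36 and 13 bytes.
With halt_on_short the byte counts come out as 36, 0, 13; without it they
come out as 36, 13, 0, and the 512-byte read is left waiting.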

> > I'm confused
> > about the flow of buffers and URBs between userspace and kernel.  Please
> > provide a more detailed example.
> 
> 1) userspace libusb app submits a bulk in transfer which wants to read > 16k
> 2) libusb splits this in multiple urbs, setting the USBDEVFS_URB_SHORT_NOT_OK
>     flag on all urbs and the  USBDEVFS_URB_BULK_CONTINUATION flag on all but the
>     first.
> 3) devio.c processes these submitting them all to the controller, remembering the
>     USBDEVFS_URB_BULK_CONTINUATION flag per urb
> 4) Device returns a short packet (less than requested on the first packet)
> 5) According to Alan Stern, with an EHCI controller, the controller halts the ep
>     because of the USBDEVFS_URB_SHORT_NOT_OK flag + 4).
> 6) devio.c walks its list of pending urbs, cancelling all urbs for the same device
>     + ep which have the USBDEVFS_URB_BULK_CONTINUATION flag set (iow stopping at the
>     first urb for the same ep without the USBDEVFS_URB_BULK_CONTINUATION flag set).
> 
> If I understood Alan correctly he believes the XHCI controller not doing 5 is the
> problem.

Yes.
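
For reference, steps 2 and 6 quoted above can be modelled roughly as
follows.  This is a stand-alone sketch, not the libusb or devio.c source:
struct model_urb, split_transfer() and cancel_continuations() are
hypothetical names, the MODEL_* flag values are illustrative (the real
definitions are in <linux/usbdevice_fs.h>), and the 16 KB per-URB limit is
taken from the 16k + 4k split described earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the usbfs URB flags. */
#define MODEL_SHORT_NOT_OK      0x01u
#define MODEL_BULK_CONTINUATION 0x04u
#define MODEL_MAX_URB_LEN       16384u  /* assumed per-URB limit */

struct model_urb {
    unsigned ep;        /* endpoint address */
    unsigned flags;
    size_t length;
    int cancelled;
};

/* Step 2: split one bulk-IN transfer; returns the number of URBs filled. */
static size_t split_transfer(unsigned ep, size_t total,
                             struct model_urb *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    while (total > 0 && n < max) {
        size_t len = total < MODEL_MAX_URB_LEN ? total : MODEL_MAX_URB_LEN;

        out[n].ep = ep;
        /* SHORT_NOT_OK on every URB, CONTINUATION on all but the first. */
        out[n].flags = MODEL_SHORT_NOT_OK |
                       (n ? MODEL_BULK_CONTINUATION : 0u);
        out[n].length = len;
        out[n].cancelled = 0;
        total -= len;
        n++;
    }
    return n;
}

/*
 * Step 6: after URB `failed` ends short, cancel the URBs queued behind it
 * on the same endpoint that carry the CONTINUATION flag, stopping at the
 * first one without it.  Returns the number of URBs cancelled.
 */
static size_t cancel_continuations(struct model_urb *q, size_t count,
                                   size_t failed)
{
    size_t cancelled = 0;

    assert(q != NULL && failed < count);
    for (size_t i = failed + 1; i < count; i++) {
        if (q[i].ep != q[failed].ep)
            continue;   /* other endpoints are not touched */
        if (!(q[i].flags & MODEL_BULK_CONTINUATION))
            break;      /* start of the next, unrelated transfer */
        q[i].cancelled = 1;
        cancelled++;
    }
    return cancelled;
}
```

Splitting a 20480-byte read this way gives one 16384-byte URB and one
4096-byte URB.  The walk only helps if the endpoint queue is stopped while
it runs, which is what step 5 is supposed to guarantee.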

> Looking at this again I think I've found a (related) bug in devio.c, if another
> (not a continued, but a completely different) bulk transfer has already been
> submitted to the ep, who / what is going to restart the ep, after the urbs which
> belong to the short transfer have been canceled ?

The HCD automatically restarts the endpoint when the completion handler 
returns.

> >> Thus the second packet has consumed the data the device had ready
> >> which should have been read by the next bulk transfer from the guest
> >> pov.
> >>
> >> All in all it seems that the cancel of further packets done by
> >> drivers/usb/core/devio.c: cancel_bulk_urbs()
> >> comes too late when the device is on an XHCI controller, it seems that
> >> the controller is already "executing" the next bulk transfer *before*
> >> the completion handler of the previous one has completed.
> >
> > Yes, I believe that's true.  If the packet is already queued on the
> > endpoint ring, the xHCI driver is going to start transferring that
> > packet as soon as the previous transfer completes.
> 
> Right, so if I understood correctly, devio assumes the EP ring will stop
> automatically on a short transfer when the urb has the
> USBDEVFS_URB_SHORT_NOT_OK flag set.

It's not just an assumption; this behavior is documented.  The HCD is 
obligated to do it.

> > It's really odd that this behavior worked under EHCI.  AFAICT, the
> > USBDEVFS_URB_BULK_CONTINUATION flag isn't even exposed to the host
> > controller (only to usbfs), so I don't see how canceling the bulk
> > command under EHCI could even work.
> 
> devio assumes that the ep stops on the first short read if
> USBDEVFS_URB_SHORT_NOT_OK is set, and for EHCI this seems to be the
> case.

Yes.  Also for UHCI, OHCI, and dummy-hcd.  Presumably most of the other 
HCDs too, though I'm not familiar with their details.

> > Or why can't users just migrate to new
> > kernels that have the total usbfs upper limit, and not the per packet
> > limit?
> 
> Because professional users often need support, which means using one of
> the enterprise distros (RHEL-6 in this case), which don't have a new
> enough kernel.
> 
> Also bulk transfers can be up to 4 MB (at a minimum; I could not find
> a hard upper limit for them) and at some point it starts making sense to
> split them, ie maybe have part of the urbs for a 4 MB transfer queued up
> and as some complete add more. Then as the first transfer nears completion
> start queuing up some urbs for the next 4 MB transfer, etc. When doing
> things like that from multiple apps at the same time you will soon
> run into limitations even with the new kernel limits. So at some time
> it will always start making sense to split the transfers, so we need
> splitting up transfers to work.

In addition, allocating a large (i.e., 4 MB or more) area of kernel 
memory is not to be done lightly.  It's a lot more likely to fail than 
allocating 256 areas of 16 KB each.
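
The figure of 256 is just 4 MB divided by 16 KB.  A small helper for that
arithmetic, as a sketch (chunk_count() is a made-up name, not an existing
kernel or libusb function):

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunk-sized pieces needed to cover total bytes, rounded up. */
static size_t chunk_count(size_t total, size_t chunk)
{
    assert(chunk > 0);
    return total / chunk + (total % chunk != 0);
}
```

chunk_count(4 * 1024 * 1024, 16 * 1024) is 256, and the 20480-byte read
from the example earlier in the thread needs 2 pieces.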

Alan Stern
