On Mon, 4 Jun 2012, Hans de Goede wrote: > >> Hi, > >> > >> I spend the last 1.5 days debugging a problem with redirecting > >> certain USB mass-storage devices to a Windows 7 vm. > >> > >> This problem only happens when the device is plugged into an > >> USB 3 port. > >> > >> I believe I now know what is happening: > >> > >> First the working sequence: > >> 1) Windows guest writes some bulk data to the msd > >> 2) Windows guest tries to read up to 20480 bytes, only reads 36 bytes > >> 3) Windows guest tries to read up to 512 bytes, only reads 13 bytes > >> > >> The non working sequence: > >> 1) Windows guest writes some bulk data to the msd > >> 2) Windows guest tries to read up to 20480 bytes, only reads 49 bytes > >> 3) Windows guest tries to read up to 512 bytes -> timeout > >> > >> Here is what I believe is happening, at 2 libusb splits the > >> bulk in transfer into 2 packets, 1 of 16k and 1 of 4k, setting the > >> USBDEVFS_URB_SHORT_NOT_OK flag on both and the > >> USBDEVFS_URB_BULK_CONTINUATION flag on the second. > >> > >> With the EHCI controller all then works as intended, the first packet > >> transfers 36 bytes and returns a status of EREMOTEIO, the second > >> packet gets cancelled by drivers/usb/core/devio.c, transferring 0 bytes > >> and returns a status of ECONNRESET and we all live happily ever after :) > >> > >> With the XHCI controller however, the first packet transfers 36 bytes > >> and returns a status of EREMOTEIO, as it should, but the second > >> packet transfers the 13 next bytes and returns a status of ECONNRESET. > > > > So you're fine with the second packet getting the ECONNRESET status > > returned, but you don't want the 13 bytes transferred. You want zero > > bytes transferred. > > Correct. Not exactly. The 13 bytes should indeed be transferred; the problem is that they are getting transferred to the wrong URB. > > I'm confused > > about the flow of buffers and URBs between userspace and kernel. Please > > provide a more detailed example. > > 1) userspace libusb app submits a bulk in transfer which wants to read > 16k > 2) libusb splits this in multiple urbs, setting the USBDEVFS_URB_SHORT_NOT_OK > flag on all urbs and the USBDEVFS_URB_BULK_CONTINUATION flag on all but the > first. > 3) devio.c processes these submitting them all to the controller, remembering the > USBDEVFS_URB_BULK_CONTINUATION flag per urb > 4) Device returns a short packet (less then requested on the first packet) > 5) According to Alan Stern, with an EHCI controller, the controller halts the ep > because of the USBDEVFS_URB_SHORT_NOT_OK flag + 4). > 6) devio.c walk its list of pending urbs, cancelling all urbs for the same device > + ep which have the USBDEVFS_URB_BULK_CONTINUATION flag set (iow stopping at the > first urb for the same ep without the USBDEVFS_URB_BULK_CONTINUATION flag set. > > If I understood Alan correctly he believes the XHCI controller not doing 5 is the > problem. Yes. > Looking at this again I think I've found a (related) bug in devio.c, if another > (not a continued, but a completely different) bulk transfer has already been > submitted to the ep, who / what is going to restart the ep, after the urbs which > belong to the short transfer have been canceled ? The HCD automatically restarts the endpoint when the completion handler returns. > >> Thus the second packets has consumed the data the device had ready > >> which should have been read by the next bulk transfer from the guest > >> pov. > >> > >> All in all it seems that the cancel of further packets done by > >> drivers/usb/core/devio.c: cancel_bulk_urbs() > >> comes too late when the device is on an XHCI controller, it seems that > >> the controller is already "executing" the next bulk transfer *before* > >> the completion handler of the previous one has completed. > > > > Yes, I believe that's true. If the packet is already queued on the > > endpoint ring, the xHCI driver is going to start transferring that > > packet as soon as the previous transfer completes. > > Right, so if I understood correctly, devio assumes the EP ring will stop > automatically on a short transfer when the urb has the > USBDEVFS_URB_SHORT_NOT_OK flag set. It's not just an assumption; this behavior is documented. The HCD is obligated to do it. > > It's really odd that this behavior worked under EHCI. AFAICT, the > > USBDEVFS_URB_BULK_CONTINUATION flag isn't even exposed to the host > > controller (only to usbfs), so I don't see how canceling the bulk > > command under EHCI could even work. > > devio assumes that the ep stops on the first short read if > USBDEVFS_URB_SHORT_NOT_OK is set, and for EHCI this seems to be the > case. Yes. Also for UHCI, OHCI, and dummy-hcd. Presumably most of the other HCDs too, though I'm not familiar with their details. > > Or why can't users just migrate to new > > kernels that have the total usbfs upper limit, and not the per packet > > limit? > > Because professional users often need support, which means using one of > the enterprise distros (RHEL-6 in this case), which don't have a new > enough kernel. > > Also bulk transfers can be up to 4 MB (as a minimum I could not find > a hard upper limit for them) and at some point it starts making sense to > split them, ie maybe have part of the urbs for a 4 MB transfer queued up > and as some complete add more. Then as the first transfer nears completion > start queuing up some urbs for the next 4 MB transfer, etc. When doing > things like that from multiple apps at the same time you will soon > run into limitations even with the new kernel limits. So at some time > it will always start making sense to split the transfers, so we need > splitting up transfers to work. In addition, allocating a large (i.e., 4 MB or more) area of kernel memory is not to be done lightly. It's a lot more likely to fail than allocating 256 areas of 16 KB each. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html