Hi,

andy_purcell@xxxxxxxxxxxx writes:
> I have implemented a USB device function using Linux functionfs and
> now there is a problem being reported.
>
> I need to ask this group for advice.
>
> The problem is this:
> 1) device boots
>
> 2) some usb transfers happen, all are OK
>
> 3) a device app runs to completion (USB quiescent during this time, no
> USB transfers required)
>
> 4) the controlling PC starts a 4 KByte USB transfer to the device, but
> this transfer does not finish. Only 3 Kbytes are ACK'd by the device.
>
> (A USB analyzer shows the host trying to send more, but the
> device persistently NAK's)
>
> If step (3) is omitted, everything works fine. It is reliable - 15/15
> times it is OK.
>
> The USB device function is implemented with functionfs and aio. Most
> of the implementation is in user space.
>
> An off-the-shelf low level Linux driver is being used.
>
> Regression tests show no problems with various sized USB transfers for
> over 24 hours.

Okay, let's try to figure out what's going on. Are you using dwc3, by
any chance? If you are, can you capture tracepoints of the failing case?
While it could be something on the application side, I want to be sure
the controller is behaving properly.

For details on how to capture tracepoints, see [1] below.

> A colleague has investigated and has asserted user space is not the
> right way to do things.
>
> He says:
>
> "It appeared that running the <device app> was enough to swap the usb
> code out that it wasn't able to swap back in quick enough to respond
> to the USB traffic in a timely fashion" .... "This is the major
> drawback to user space drivers as opposed to kernel drivers. Kernel
> drivers pages are locked into memory while user space can be swapped
> out.
> There were numerous articles about this, but the best one I
> found was:
>
> http://www.makelinux.net/ldd3/chp-2-sect-9 "
>
> Linux Device Drivers, 3rd Edition, By Jonathan Corbet,
> Greg Kroah-Hartman, Alessandro Rubini : February 2005
>
> "There pertinent part is:
>
> o Response time is slower, because a context switch is required to
> transfer information or actions between the client and the hardware.
>
> o Worse yet, if the driver has been swapped to disk, response time
> is unacceptably long. Using the mlock system call might help, but
> usually you'll need to lock many memory pages, because a user-space
> program depends on a lot of library code. mlock, too, is limited to
> privileged users.
>
> Some articles I read stated that the swap could take seconds."
>
>
> QUESTIONS:
>
> - Did I make a mistake using user space and functionfs?
> (I thought state-of-the-art way to do usb function drivers was to
> use functionfs...)

right, unless you can use some of the in-tree functions, it doesn't make
sense to rely on an ever-changing internal API :-)

> - Should I add calls to mlock() to try to fix?

that's an easy enough test, yes :-)

> Any advice is appreciated.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/driver-api/usb/dwc3.rst#n113

--
balbi
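
As a rough illustration of the mlock() test discussed above, here is a minimal sketch. It assumes the functionfs daemon is an ordinary C program and that locking the whole address space with mlockall() is acceptable for a quick experiment; the helper name lock_process_memory() is hypothetical and not part of the original poster's code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: try to pin every page of the calling process in
 * RAM so that it cannot be swapped out. Returns 0 on success, or the
 * errno value reported by mlockall() on failure.
 */
static int lock_process_memory(void)
{
    /*
     * MCL_CURRENT locks everything mapped right now (program text, data,
     * stack, shared libraries); MCL_FUTURE also locks mappings created
     * later, such as buffers allocated after this call.
     */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int err = errno;

        /* Typically EPERM or ENOMEM when RLIMIT_MEMLOCK is too low. */
        fprintf(stderr, "mlockall failed: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

The call would go early in the daemon's startup, before the endpoints are opened. Because mlockall() covers the libraries the process has mapped, it sidesteps the "many memory pages" concern from the quoted LDD3 passage, at the cost of keeping the whole process resident. An unprivileged process is bounded by RLIMIT_MEMLOCK, so the test may need CAP_IPC_LOCK or a raised limit. If the stall in step (4) persists with memory locked, swapping is probably not the cause.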