On Fri, 2 Oct 2015, Felipe Balbi wrote:

> Hi Alan,
>
> here's a question which I hope you can help me understand :-)
>
> Why do we have that kthread for the mass storage gadgets ? I noticed a very
> interesting behavior.

Basically, the mass-storage gadget uses a kthread in order to access
the backing file.  Obviously a kernel driver doesn't _need_ to use a
separate thread to read or write a file (think of /dev/loop, for
example).  But doing it that way is very easy because the driver merely
has to imitate read(2) and write(2) system calls, which have a simple
and well defined interface.

To do anything more direct would mean getting deeply involved in the
block and filesystem layers, and that wasn't my goal -- I wanted to
write a USB gadget driver, not re-implement the loop driver.

Also, remember that at the time the MS Gadget was written, speed over
USB wasn't a pressing issue.  You couldn't transfer more than about 40
MB/s anyway, so efficiency in the gadget wasn't a terribly high
priority.

> Basically, the MSC class works in a loop like so:
>
> CBW
> Data Transfer
> CSW
>
> In our implementation, what we do is:
>
> CBW
> wake_up_process()
> Data Transfer
> wake_up_process()
> CSW
> wake_up_process()

The wake_up_process() call is the notification from the completion
routine telling the kthread that a request has completed.  The kthread
doesn't necessarily stop and wait for these notifications; it can
continue processing in parallel.

> Now here's the interesting bit. Every time we wake_up_process(), we basically
> don't do anything until MSC's kthread gets finally scheduled and has a chance of
> doing its job.

That's not entirely true; the kthread may already be running.  For
example, the default MS gadget uses two I/O buffers (there's a Kconfig
option to use more if you want).  When carrying out a READ, the kthread
fills up the first buffer and submits it to the UDC.
But it doesn't stop and wait for wake_up_process(); instead it goes on
to fill up the second buffer while the first is being sent back to the
host.  The wake_up_process() may very well occur while the second
buffer is being filled, in which case it won't do anything (since the
kthread isn't asleep).

> This means that the host keeps sending us tokens but the UDC
> doesn't have any request queued to start a transfer. This happens especially with
> IN endpoints, not so much on OUT directions. See figure one [1] we can see that
> host issues over 7 POLLs before UDC has finally started a usb_request, sometimes
> this goes for even longer (see image [3]).

Figure [1] is misleading.  The 7 POLLs you see are at the very start of
a READ.  It's not surprising that the host can poll 7 times before the
gadget manages to read the first block of data from the backing file.
This isn't a case where the kthread could have been doing something
useful instead of waiting around to hear from the host.

Figure [3] is difficult to interpret because it doesn't include the
transfer lengths.  I can't tell what was going on during the 37 + 50
POLLs before the first OUT transfer.  If that was a CDB starting a new
command, I don't see why it would take that long to schedule the
kthread unless the CPU was busy with other tasks.

> On figure two we can see that on this particular session, I had as much as 15%
> of the bandwidth wasted on POLLs. With this current setup I'm 34MB/sec and with
> the added 15% that would get really close to 40MB/sec.

So high speed, right?  Are the numbers in the figure handshake _counts_
or handshake _times_?  A simple NAK doesn't use much bandwidth.  Even
if 15% of the handshakes are NAKs, it doesn't mean you're wasting 15%
of the bandwidth.

> So the question is, why do we have to wait for that kthread to get scheduled ?
> Why couldn't we skip it completely ? Is there really anything left in there that
> couldn't be done from within usb_request->complete() itself ?

The real answer is the calls to vfs_read() and vfs_write() -- those
have to occur in process context.

In theory, the CBW packet doesn't have to be processed by the kthread.
But processing it in the completion routine wouldn't help, because the
kthread would still have to be scheduled in order to carry out the READ
or WRITE command.  In other words, changing:

	Completion handler		Kthread
	------------------		-------
	Receive CBW
	Wake up kthread
					Process CBW
					Start READ or WRITE

into:

	Completion handler		Kthread
	------------------		-------
	Receive CBW
	Process CBW
	Wake up kthread
					Start READ or WRITE

doesn't really give any significant advantage.

> I'll spend some time on that today and really dig that thing up, but if you know
> the answer off the top of your head, I'd be happy to hear.

I hope this explains the issues clearly enough.

Alan Stern
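
For illustration only, the two-buffer READ pipelining described above
can be modelled with a small user-space timeline simulation.  This is a
rough sketch, not code from the gadget driver: simulate_read(), struct
read_stats, MAX_BUFS and the fill/send timings are all hypothetical
names and made-up parameters.  It assumes a fixed time to fill a buffer
from the backing file, a fixed time for the UDC to send it, and
in-order completion.

```c
#include <assert.h>

#define MAX_BUFS 8	/* hypothetical upper bound for this toy model */

/* Result of one simulated READ (all names here are illustrative). */
struct read_stats {
	long total_us;		/* time at which the last buffer has been sent */
	int sleeps;		/* times the kthread had to block for a buffer */
	int noop_wakeups;	/* completions that arrived while it was busy */
};

/*
 * Toy model: the kthread fills a buffer (fill_us), submits it, and moves
 * on to the next buffer.  It only sleeps when the buffer it wants to
 * reuse is still in flight.  The UDC sends requests in order, taking
 * send_us for each one.
 */
static struct read_stats simulate_read(int nblocks, int nbufs,
				       long fill_us, long send_us)
{
	long buf_free_at[MAX_BUFS] = {0};	/* completion time per buffer */
	long kthread_now = 0;			/* the kthread's clock */
	long udc_free_at = 0;			/* when the UDC goes idle */
	struct read_stats st = {0, 0, 0};
	int i;

	assert(nblocks >= 0 && nbufs >= 1 && nbufs <= MAX_BUFS);

	for (i = 0; i < nblocks; i++) {
		int b = i % nbufs;
		long start;

		if (buf_free_at[b] > kthread_now) {
			/* Buffer still in flight: sleep until its completion. */
			kthread_now = buf_free_at[b];
			st.sleeps++;
		} else if (i >= nbufs) {
			/* Completion came while the kthread was running. */
			st.noop_wakeups++;
		}

		/* Stand-in for the vfs_read() step done in process context. */
		kthread_now += fill_us;

		/* Submit: the UDC starts once it has finished earlier requests. */
		start = kthread_now > udc_free_at ? kthread_now : udc_free_at;
		udc_free_at = start + send_us;
		buf_free_at[b] = udc_free_at;
	}

	st.total_us = udc_free_at;
	return st;
}
```

In this model, simulate_read(4, 1, 100, 80) finishes at 720 us with
three sleeps, while simulate_read(4, 2, 100, 80) finishes at 480 us
with no sleeps and two wake-ups that find the kthread already running.
Real timings depend on the backing store, the UDC and the scheduler.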