On 06 Oct 2015, at 12:34, Paul Jones <p.jones@xxxxxxxxxx> wrote: > On 06 Oct 2015, at 16:44, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > >> On Tue, 6 Oct 2015, Paul Jones wrote: >> >>> On 05 Oct 2015, at 23:09, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: >>> >>>> On Mon, 5 Oct 2015, Paul Jones wrote: >>>> >>>>>> Increasing the max_sectors_kb value, on the other hand, might remove >>>>>> overhead by allowing a higher percentage of the transfer to consist of >>>>>> real data as opposed to CBW and CSW packets. This depends to some >>>>>> extent on other factors (such as whether the memory pages are >>>>>> contiguous, allowing for larger transfers), but it can't hurt. >>>>> >>>>> I tried changing the max_sectors_kb to 64 with 64k block size in dd and it�s transferring at the same speed. >>>> >>>> That's a decrease, not an increase. Try changing it to 1024 or more. >>> I can�t increase the value, any value over 120 is rejected. >>> I therefore decided to see if the speed would decrease by decreasing the block size, which doesn�t seem to be the case. >>> Is there a setting somewhere that limits the max_sectors_kb value? >> >> Yes, there is. I forgot about this hard limit. I think you can change >> it by writing to /sys/bus/usb/devices/.../max_sectors, where ... is the >> path for the mass-storage interface of your device. > I changed /sys/block/<device>/device/max_sectors to 4096 and /sys/block/<device>/queue/max_sectors_kb to 2048 > That improves matters slightly from 140MB/s to 160MB/s. > > Using Paul Zimmerman’s suggestion I can increase that to 174MB/s using a 160k buffer. > However changing to a tmpfs backing file for my gadget makes no difference in speed at all. > I guess that means that the delays are actually part of the gadget and/or f_mass_storage implementation. > >>> In my current setup I have 35us overhead for responding to the CBW >>> and CSW requests (70 us total) and there seems to be some delay in >>> the bulk transmission of the data as well as the transmission time >>> for the data is much slower than 5Gb/s. >> >> All those things take time. No doubt some of the delay is the time >> required to read the data from the backing file. Most of the rest is >> transmission time. >> >> Note that you can never actually attain 5 Gb/s, even under the best >> circumstances. For one thing, data on the USB bus is encoded in a way >> that uses 10 bits on the bus to send 8 bits of data. So you could >> never achieve more than 500 MB/s even if there were no packet headers, >> breaks between packets, and so on. > I would be happy with anything over 300MB/s for sequential transfers. > >> >>> Is there a way to trace the USB frames to see where the delays are >>> occurring during the actual data transfer? >> >> Sure, if you have a USB bus analyzer. There's no way to do it in >> software alone. > Couldn’t we at least use interrupts to timestamp when individual frames are sent/received or when DMA starts/completes? Added some debugging statements in f_mass_storage/net2280 to get an idea of what is going on on the wire (as I unfortunately don’t have any tools to figure it out any other way). Additional debug statement locations: - net2280_queue: received request is at the beginning of the function - net2280_queue: starting request is right after the spin lock is acquired - net2280_queue: finished is just before the spin lock is released - mass_storage.1/lun.0: file read size @ offset is just before vfs_read - mass_storage.1/lun.0: file read size @ offset -> read is just after the vfs_read - configfs-gadget gadget: bulk-in/bulk-out wait is just before the spin lock in start_transfer - configfs-gadget gadget: bulk-in/bulk-out start is right after the spin lock is acquired in start_transfer - configfs-gadget gadget: bulk-in/bulk-out queued is right after the call to usb_ep_queue in start_transfer - configfs-gadget gadget: bulk-in/bulk-out completed is at the beginning of bulk_in_completed/bulk_out_completed Log for a single large read request: Oct 16 13:10:27 usbserver kernel: [ 176.372445] configfs-gadget gadget: SCSI command: READ(10); Dc=10, Di=122880; Hc=10, Hi=122880 Oct 16 13:10:27 usbserver kernel: [ 176.372446] mass_storage.1/lun.0: file read 16384 @ 268435456 Oct 16 13:10:27 usbserver kernel: [ 176.372448] mass_storage.1/lun.0: file read 16384 @ 268435456 -> 16384 Oct 16 13:10:27 usbserver kernel: [ 176.372448] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372448] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372448] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372449] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372455] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372455] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372456] mass_storage.1/lun.0: file read 16384 @ 268451840 Oct 16 13:10:27 usbserver kernel: [ 176.372458] mass_storage.1/lun.0: file read 16384 @ 268451840 -> 16384 Oct 16 13:10:27 usbserver kernel: [ 176.372458] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372459] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372459] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372460] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372460] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372460] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372460] mass_storage.1/lun.0: file read 16384 @ 268468224 Oct 16 13:10:27 usbserver kernel: [ 176.372463] mass_storage.1/lun.0: file read 16384 @ 268468224 -> 16384 Oct 16 13:10:27 usbserver kernel: [ 176.372463] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372463] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372463] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372464] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372464] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372465] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372465] mass_storage.1/lun.0: file read 16384 @ 268484608 Oct 16 13:10:27 usbserver kernel: [ 176.372467] mass_storage.1/lun.0: file read 16384 @ 268484608 -> 16384 Oct 16 13:10:27 usbserver kernel: [ 176.372467] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372468] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372468] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372469] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372469] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372469] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372469] mass_storage.1/lun.0: file read 16384 @ 268500992 Oct 16 13:10:27 usbserver kernel: [ 176.372471] mass_storage.1/lun.0: file read 16384 @ 268500992 -> 16384 Oct 16 13:10:27 usbserver kernel: [ 176.372472] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372472] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372472] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372473] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372473] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372474] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372474] mass_storage.1/lun.0: file read 16384 @ 268517376 Oct 16 13:10:27 usbserver kernel: [ 176.372476] mass_storage.1/lun.0: file read 16384 @ 268517376 -> 16384 Oct 16 13:10:27 usbserver kernel: [ 176.372476] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372477] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372477] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372478] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372478] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372478] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372479] mass_storage.1/lun.0: file read 16384 @ 268533760 Oct 16 13:10:27 usbserver kernel: [ 176.372481] mass_storage.1/lun.0: file read 16384 @ 268533760 -> 16384 Oct 16 13:10:27 usbserver kernel: [ 176.372481] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372481] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372481] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372482] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372482] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372483] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372483] mass_storage.1/lun.0: file read 8192 @ 268550144 Oct 16 13:10:27 usbserver kernel: [ 176.372484] mass_storage.1/lun.0: file read 8192 @ 268550144 -> 8192 Oct 16 13:10:27 usbserver kernel: [ 176.372485] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372485] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372485] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372486] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372486] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372486] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372544] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.372554] configfs-gadget gadget: bulk-in wait Oct 16 13:10:27 usbserver kernel: [ 176.372554] configfs-gadget gadget: bulk-in start Oct 16 13:10:27 usbserver kernel: [ 176.372554] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372555] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372555] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372555] configfs-gadget gadget: bulk-in queued Oct 16 13:10:27 usbserver kernel: [ 176.372640] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.372649] configfs-gadget gadget: bulk out wait Oct 16 13:10:27 usbserver kernel: [ 176.372649] configfs-gadget gadget: bulk out start Oct 16 13:10:27 usbserver kernel: [ 176.372649] net2280_queue: received request Oct 16 13:10:27 usbserver kernel: [ 176.372650] net2280_queue: starting request Oct 16 13:10:27 usbserver kernel: [ 176.372663] net2280_queue: finished Oct 16 13:10:27 usbserver kernel: [ 176.372664] configfs-gadget gadget: bulk out queued Oct 16 13:10:27 usbserver kernel: [ 176.372735] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.372833] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.372929] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.373024] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.373120] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.373182] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.373201] configfs-gadget gadget: bulk-in completed Oct 16 13:10:27 usbserver kernel: [ 176.373244] configfs-gadget gadget: bulk-out completed Oct 16 13:10:27 usbserver kernel: [ 176.373258] SCSI CDB: 00 00 00 00 00 00 The above is using the standard buffer size of 16k with 8 buffers. f_mass_storage performs a vfs_read and waits for it’s result before performing a start_in_transfer. The start_in_transfer doesn’t return until the transfer is queued. If the read request is large enough to require multiple buffers then the next vfs_read is only issued after the transfer is queued. Occasionally a vfs_read requests is slow (20+us) but the other calls are relatively fast (2-3us) for the same transfer size, which is presumably due to read ahead in the vfs layer. Similarly queuing some of the bulk transfers is very slow (4-15us), but others are fast (1-2us). The slow queuing of transfers seems to be due to waiting in a spin lock on ep->dev->lock (4+ us). I’m guessing this is caused by the fact that there sometimes seems to be significant time between when the spin lock is acquired in net2280_queue and when it is released. Any idea why that may be? It seems to take about 95-100us from the first usb_ep_queue to bulk_in_complete for 16384 bytes of payload. Subsequent bulk_in_complete events are returned at a 95-100us between them. Any ideas why that may be? Given that most of the bulk-in requests are overlapping, the underlying hardware shouldn’t have to wait for anything except for the host stalling the data reception (which I think is unlikely). I noticed that the done function in net2280 releases and requires the ep->dev->lock around the call to the completion callback. Given that sometimes acquiring the lock takes a long time, we could be stalling interrupts for a significant amount of time as well. I’ll try to add some logging in the irq handler to see how long this are taking... Paul. -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html