On 16 Oct 2015, at 15:59, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Fri, 16 Oct 2015, Paul Jones wrote: > >> Added some debugging statements in f_mass_storage/net2280 to get an idea of what is going on on the wire (as I unfortunately don’t have any tools to figure it out any other way). >> Additional debug statement locations: > ... > >> Log for a single large read request: > ... > >> The above is using the standard buffer size of 16k with 8 buffers. >> >> f_mass_storage performs a vfs_read and waits for it’s result before performing a start_in_transfer. >> The start_in_transfer doesn’t return until the transfer is queued. >> If the read request is large enough to require multiple buffers then the next vfs_read is only issued after the transfer is queued. >> >> Occasionally a vfs_read requests is slow (20+us) but the other calls are relatively fast (2-3us) for the same transfer size, which is presumably due to read ahead in the vfs layer. >> Similarly queuing some of the bulk transfers is very slow (4-15us), but others are fast (1-2us). > > Some of these delays you see may be due to interrupts coming from other > devices. There's not much you can do about those. > >> The slow queuing of transfers seems to be due to waiting in a spin lock on ep->dev->lock (4+ us). >> I’m guessing this is caused by the fact that there sometimes seems to be significant time between when the spin lock is acquired in net2280_queue and when it is released. >> Any idea why that may be? > > There shouldn't be any contention on that spinlock -- at least, not > until the driver's interrupt handler needs to run. Everything else in > the mass-storage gadget is single-threaded. I don't know why it should > take so long to acquire the spinlock when nothing else is happening, > unless once again you're seeing interrupts from other devices. I am talking about the net2280 lock and not the mass_storage lock. > >> It seems to take about 95-100us from the first usb_ep_queue to bulk_in_complete for 16384 bytes of payload. >> Subsequent bulk_in_complete events are returned at a 95-100us between them. >> Any ideas why that may be? > > That's the time it takes to actually send the data across the USB bus. > You appear to be getting about 1/3 the maximum possible throughput, > since the maximum data rate is 500 MB/s and you're getting 16384 B in > about 100 us. > >> Given that most of the bulk-in requests are overlapping, the underlying hardware shouldn’t have to wait for anything except for the host stalling the data reception (which I think is unlikely). > > Don't be so sure. The fact is, either the host's or the gadget's > hardware could easily be running at a sub-optimal speed. Hardware > implementations are rarely ideal. > >> I noticed that the done function in net2280 releases and requires the ep->dev->lock around the call to the completion callback. >> Given that sometimes acquiring the lock takes a long time, we could be stalling interrupts for a significant amount of time as well. >> I’ll try to add some logging in the irq handler to see how long this are taking... > > One thing you did not show here was the delay between sending the last > data buffer of one command and sending the first data buffer of the > next command. That overhead could significantly affect the overall > throughput. The next read command is generally handled within 20us after the previous one. Switched over to ftrace to include the net2280 irq/dma handling, logging of the spin locks and thread handling in bulk_in_complete: ----- FIRST REQUEST----- file-storage-1333 [002] .... 59.898705: do_read: READ: 122880 @ 0 file-storage-1333 [002] .... 59.898706: do_read: file read 16384 @ 0 file-storage-1333 [002] .... 59.898709: do_read: file read 16384 @ 0 -> 16384 file-storage-1333 [002] .... 59.898709: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898710: net2280_queue: received request file-storage-1333 [002] d... 59.898711: net2280_queue: starting request ----- FIRST DMA ------ file-storage-1333 [002] d... 59.898712: start_dma: start file-storage-1333 [002] d... 59.898713: start_dma: start queue file-storage-1333 [002] d... 59.898717: start_dma: end file-storage-1333 [002] d... 59.898717: net2280_queue: finished file-storage-1333 [002] .... 59.898717: start_transfer: bulk-in queued file-storage-1333 [002] .... 59.898717: do_read: file read 16384 @ 16384 file-storage-1333 [002] .... 59.898720: do_read: file read 16384 @ 16384 -> 16384 file-storage-1333 [002] .... 59.898720: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898720: net2280_queue: received request file-storage-1333 [002] d... 59.898721: net2280_queue: starting request file-storage-1333 [002] d... 59.898721: net2280_queue: finished file-storage-1333 [002] .... 59.898722: start_transfer: bulk-in queued file-storage-1333 [002] .... 59.898722: do_read: file read 16384 @ 32768 file-storage-1333 [002] .... 59.898724: do_read: file read 16384 @ 32768 -> 16384 file-storage-1333 [002] .... 59.898725: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898725: net2280_queue: received request file-storage-1333 [002] d... 59.898726: net2280_queue: starting request file-storage-1333 [002] d... 59.898726: net2280_queue: finished file-storage-1333 [002] .... 59.898726: start_transfer: bulk-in queued file-storage-1333 [002] .... 59.898726: do_read: file read 16384 @ 49152 file-storage-1333 [002] .... 59.898737: do_read: file read 16384 @ 49152 -> 16384 file-storage-1333 [002] .... 59.898737: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898737: net2280_queue: received request file-storage-1333 [002] d... 59.898738: net2280_queue: starting request file-storage-1333 [002] d... 59.898738: net2280_queue: finished file-storage-1333 [002] .... 59.898738: start_transfer: bulk-in queued file-storage-1333 [002] .... 59.898738: do_read: file read 16384 @ 65536 file-storage-1333 [002] .... 59.898741: do_read: file read 16384 @ 65536 -> 16384 file-storage-1333 [002] .... 59.898741: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898741: net2280_queue: received request file-storage-1333 [002] d... 59.898742: net2280_queue: starting request file-storage-1333 [002] d... 59.898742: net2280_queue: finished file-storage-1333 [002] .... 59.898742: start_transfer: bulk-in queued file-storage-1333 [002] .... 59.898742: do_read: file read 16384 @ 81920 file-storage-1333 [002] .... 59.898744: do_read: file read 16384 @ 81920 -> 16384 file-storage-1333 [002] .... 59.898745: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898745: net2280_queue: received request file-storage-1333 [002] d... 59.898746: net2280_queue: starting request file-storage-1333 [002] d... 59.898746: net2280_queue: finished file-storage-1333 [002] .... 59.898746: start_transfer: bulk-in queued file-storage-1333 [002] .... 59.898746: do_read: file read 16384 @ 98304 file-storage-1333 [002] .... 59.898749: do_read: file read 16384 @ 98304 -> 16384 file-storage-1333 [002] .... 59.898749: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898749: net2280_queue: received request file-storage-1333 [002] d... 59.898750: net2280_queue: starting request file-storage-1333 [002] d... 59.898750: net2280_queue: finished file-storage-1333 [002] .... 59.898751: start_transfer: bulk-in queued file-storage-1333 [002] .... 59.898751: do_read: file read 8192 @ 114688 file-storage-1333 [002] .... 59.898753: do_read: file read 8192 @ 114688 -> 8192 file-storage-1333 [002] .... 59.898753: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898753: net2280_queue: received request file-storage-1333 [002] d... 59.898754: net2280_queue: starting request file-storage-1333 [002] d... 59.898754: net2280_queue: finished file-storage-1333 [002] .... 59.898754: start_transfer: bulk-in queued ----- FIRST IRQ ----- <idle>-0 [002] d.h. 59.898828: net2280_irq: start <idle>-0 [002] d.h. 59.898829: net2280_irq: handle <idle>-0 [002] d.h. 59.898835: done: start <idle>-0 [002] d.h. 59.898836: done: calling complete callback <idle>-0 [002] d.h. 59.898836: bulk_in_complete: start <idle>-0 [002] d.h. 59.898836: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.898836: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.898837: bulk_in_complete: end <idle>-0 [002] dNh. 59.898837: done: complete callback done <idle>-0 [002] dNh. 59.898837: done: end ----- SECOND DMA ----- <idle>-0 [002] dNh. 59.898838: start_dma: start <idle>-0 [002] dNh. 59.898840: start_dma: start queue <idle>-0 [002] dNh. 59.898843: start_dma: end <idle>-0 [002] dNh. 59.898845: net2280_irq: end file-storage-1333 [002] .... 59.898847: start_transfer: bulk-in start file-storage-1333 [002] .... 59.898847: net2280_queue: received request file-storage-1333 [002] d... 59.898848: net2280_queue: starting request file-storage-1333 [002] d... 59.898848: net2280_queue: finished file-storage-1333 [002] .... 59.898848: start_transfer: bulk-in queued ----- SECOND IRQ ----- <idle>-0 [002] d.h. 59.898945: net2280_irq: start <idle>-0 [002] d.h. 59.898945: net2280_irq: handle <idle>-0 [002] d.h. 59.898952: done: start <idle>-0 [002] d.h. 59.898952: done: calling complete callback <idle>-0 [002] d.h. 59.898952: bulk_in_complete: start <idle>-0 [002] d.h. 59.898952: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.898953: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.898953: bulk_in_complete: end <idle>-0 [002] dNh. 59.898953: done: complete callback done <idle>-0 [002] dNh. 59.898953: done: end ----- THIRD DMA ----- <idle>-0 [002] dNh. 59.898954: start_dma: start <idle>-0 [002] dNh. 59.898955: start_dma: start queue <idle>-0 [002] dNh. 59.898956: start_dma: end <idle>-0 [002] dNh. 59.898959: net2280_irq: end file-storage-1333 [002] .... 59.898963: start_transfer: bulk-out start file-storage-1333 [002] .... 59.898963: net2280_queue: received request file-storage-1333 [002] d... 59.898963: net2280_queue: starting request file-storage-1333 [002] d... 59.898965: start_dma: start file-storage-1333 [002] d... 59.898980: start_dma: start queue file-storage-1333 [002] d... 59.898984: start_dma: end file-storage-1333 [002] d... 59.898984: net2280_queue: finished file-storage-1333 [002] .... 59.898985: start_transfer: bulk-out queued <idle>-0 [002] d.h. 59.899057: net2280_irq: start <idle>-0 [002] d.h. 59.899057: net2280_irq: handle <idle>-0 [002] d.h. 59.899063: done: start <idle>-0 [002] d.h. 59.899064: done: calling complete callback <idle>-0 [002] d.h. 59.899064: bulk_in_complete: start <idle>-0 [002] d.h. 59.899064: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.899064: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.899064: bulk_in_complete: end <idle>-0 [002] dNh. 59.899064: done: complete callback done <idle>-0 [002] dNh. 59.899064: done: end <idle>-0 [002] dNh. 59.899065: start_dma: start <idle>-0 [002] dNh. 59.899067: start_dma: start queue <idle>-0 [002] dNh. 59.899070: start_dma: end <idle>-0 [002] dNh. 59.899070: net2280_irq: end <idle>-0 [002] d.h. 59.899167: net2280_irq: start <idle>-0 [002] d.h. 59.899167: net2280_irq: handle <idle>-0 [002] d.h. 59.899174: done: start <idle>-0 [002] d.h. 59.899174: done: calling complete callback <idle>-0 [002] d.h. 59.899174: bulk_in_complete: start <idle>-0 [002] d.h. 59.899174: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.899174: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.899174: bulk_in_complete: end <idle>-0 [002] dNh. 59.899174: done: complete callback done <idle>-0 [002] dNh. 59.899174: done: end <idle>-0 [002] dNh. 59.899176: start_dma: start <idle>-0 [002] dNh. 59.899177: start_dma: start queue <idle>-0 [002] dNh. 59.899177: start_dma: end <idle>-0 [002] dNh. 59.899183: net2280_irq: end <idle>-0 [002] d.h. 59.899278: net2280_irq: start <idle>-0 [002] d.h. 59.899279: net2280_irq: handle <idle>-0 [002] d.h. 59.899285: done: start <idle>-0 [002] d.h. 59.899285: done: calling complete callback <idle>-0 [002] d.h. 59.899285: bulk_in_complete: start <idle>-0 [002] d.h. 59.899285: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.899286: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.899286: bulk_in_complete: end <idle>-0 [002] dNh. 59.899286: done: complete callback done <idle>-0 [002] dNh. 59.899286: done: end <idle>-0 [002] dNh. 59.899287: start_dma: start <idle>-0 [002] dNh. 59.899288: start_dma: start queue <idle>-0 [002] dNh. 59.899288: start_dma: end <idle>-0 [002] dNh. 59.899292: net2280_irq: end <idle>-0 [002] d.h. 59.899401: net2280_irq: start <idle>-0 [002] d.h. 59.899401: net2280_irq: handle <idle>-0 [002] d.h. 59.899408: done: start <idle>-0 [002] d.h. 59.899408: done: calling complete callback <idle>-0 [002] d.h. 59.899408: bulk_in_complete: start <idle>-0 [002] d.h. 59.899408: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.899409: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.899409: bulk_in_complete: end <idle>-0 [002] dNh. 59.899409: done: complete callback done <idle>-0 [002] dNh. 59.899409: done: end <idle>-0 [002] dNh. 59.899410: start_dma: start <idle>-0 [002] dNh. 59.899411: start_dma: start queue <idle>-0 [002] dNh. 59.899411: start_dma: end <idle>-0 [002] dNh. 59.899414: net2280_irq: end <idle>-0 [002] d.h. 59.899529: net2280_irq: start <idle>-0 [002] d.h. 59.899529: net2280_irq: handle <idle>-0 [002] d.h. 59.899536: done: start <idle>-0 [002] d.h. 59.899536: done: calling complete callback <idle>-0 [002] d.h. 59.899536: bulk_in_complete: start <idle>-0 [002] d.h. 59.899536: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.899537: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.899537: bulk_in_complete: end <idle>-0 [002] dNh. 59.899537: done: complete callback done <idle>-0 [002] dNh. 59.899537: done: end <idle>-0 [002] dNh. 59.899538: start_dma: start <idle>-0 [002] dNh. 59.899539: start_dma: start queue <idle>-0 [002] dNh. 59.899539: start_dma: end <idle>-0 [002] dNh. 59.899542: net2280_irq: end <idle>-0 [002] d.h. 59.899609: net2280_irq: start <idle>-0 [002] d.h. 59.899609: net2280_irq: handle <idle>-0 [002] d.h. 59.899619: done: start <idle>-0 [002] d.h. 59.899619: done: calling complete callback <idle>-0 [002] d.h. 59.899619: bulk_in_complete: start <idle>-0 [002] d.h. 59.899619: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.899620: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.899620: bulk_in_complete: end <idle>-0 [002] dNh. 59.899620: done: complete callback done <idle>-0 [002] dNh. 59.899620: done: end <idle>-0 [002] dNh. 59.899621: start_dma: start <idle>-0 [002] dNh. 59.899622: start_dma: start queue <idle>-0 [002] dNh. 59.899622: start_dma: end <idle>-0 [002] dNh. 59.899625: net2280_irq: end <idle>-0 [002] dNh. 59.899630: net2280_irq: start <idle>-0 [002] dNh. 59.899630: net2280_irq: handle <idle>-0 [002] dNh. 59.899636: done: start <idle>-0 [002] dNh. 59.899637: done: calling complete callback <idle>-0 [002] dNh. 59.899637: bulk_in_complete: start <idle>-0 [002] dNh. 59.899637: bulk_in_complete: wakeup thread <idle>-0 [002] dNh. 59.899637: bulk_in_complete: wokeup thread <idle>-0 [002] dNh. 59.899637: bulk_in_complete: end <idle>-0 [002] dNh. 59.899637: done: complete callback done <idle>-0 [002] dNh. 59.899637: done: end <idle>-0 [002] dNh. 59.899640: net2280_irq: end <idle>-0 [002] dNh. 59.899641: net2280_irq: start <idle>-0 [002] dNh. 59.899641: net2280_irq: handle <idle>-0 [002] dNh. 59.899647: net2280_irq: end <idle>-0 [002] d.h. 59.899665: net2280_irq: start <idle>-0 [002] d.h. 59.899666: net2280_irq: handle <idle>-0 [002] d.h. 59.899680: done: start <idle>-0 [002] d.h. 59.899680: done: calling complete callback <idle>-0 [002] d.h. 59.899681: bulk_out_complete: start <idle>-0 [002] d.h. 59.899681: bulk_out_complete: wakeup thread <idle>-0 [002] dNh. 59.899681: bulk_out_complete: wokeup thread <idle>-0 [002] dNh. 59.899681: bulk_out_complete: end <idle>-0 [002] dNh. 59.899681: done: complete callback done <idle>-0 [002] dNh. 59.899681: done: end <idle>-0 [002] dNh. 59.899681: net2280_irq: end <idle>-0 [002] dNh. 59.899684: net2280_irq: start <idle>-0 [002] dNh. 59.899684: net2280_irq: handle <idle>-0 [002] dNh. 59.899693: net2280_irq: end <idle>-0 [002] dNh. 59.899694: net2280_irq: start <idle>-0 [002] dNh. 59.899694: net2280_irq: handle <idle>-0 [002] dNh. 59.899698: net2280_irq: end ----- SECOND REQUEST ----- file-storage-1333 [002] .... 59.899702: do_read: READ: 8192 @ 122880 file-storage-1333 [002] .... 59.899702: do_read: file read 8192 @ 122880 file-storage-1333 [002] .... 59.899703: do_read: file read 8192 @ 122880 -> 8192 file-storage-1333 [002] .... 59.899704: start_transfer: bulk-in start file-storage-1333 [002] .... 59.899704: net2280_queue: received request file-storage-1333 [002] d... 59.899704: net2280_queue: starting request file-storage-1333 [002] d... 59.899706: start_dma: start file-storage-1333 [002] d... 59.899707: start_dma: start queue file-storage-1333 [002] d... 59.899710: start_dma: end file-storage-1333 [002] d... 59.899710: net2280_queue: finished file-storage-1333 [002] .... 59.899710: start_transfer: bulk-in queued First DMA to first IRQ = 116us. Second DMA to second IRQ = 107us. These seem fairly stable as the last request in a 500+MB transfer had exactly the same timings. Each IRQ is taking around 14us to handle, during which most of the time the ep->dev->lock is locked (only released in the done function whilst calling the callback). Setting up DMA is taking around 5us to handle. Measured end-to-end performance: 135MB/s (145MB/s with tweaked max_sector). That’s not bad given that the DMA is only managing 153MB/s. Yousaf: is there a way to get an IRQ on a stall/fifo full/etc. to see if some throttling is going on? Any other parts of the code that would be useful to have timing on? Are there any tools to measure host to device raw performance (max bulk throughput)? Paul -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html