On Tue, 2018-09-11 at 21:58 +0300, Matwey V. Kornilov wrote:
> Tue, 28 Aug 2018 at 10:17, Matwey V. Kornilov <matwey.kornilov@xxxxxxxxx>:
> >
> > Tue, 21 Aug 2018 at 20:06, Matwey V. Kornilov <matwey@xxxxxxxxxx>:
> > >
> > > DMA coherency slows the transfer down on systems without hardware
> > > coherent DMA.
> > > Instead we use noncoherent DMA memory and explicit sync at data receive
> > > handler.
> > >
> > > Based on the previous commit, the following performance benchmarks have
> > > been carried out. Average memcpy() data transfer rate (rate) and handler
> > > completion time (time) have been measured when running video stream at
> > > 640x480 resolution at 10fps.
> > >
> > > x86_64 based system (Intel Core i5-3470). This platform has hardware
> > > coherent DMA support and the proposed change doesn't make a big
> > > difference here.
> > >
> > > * kmalloc:            rate = (2.0 +- 0.4) GBps
> > >                       time = (5.0 +- 3.0) usec
> > > * usb_alloc_coherent: rate = (3.4 +- 1.2) GBps
> > >                       time = (3.5 +- 3.0) usec
> > >
> > > We see that the measurements agree within error ranges in this case.
> > > So the theoretically predicted performance degradation cannot be
> > > reliably measured here.
> > >
> > > armv7l based system (TI AM335x BeagleBone Black @ 300MHz). This platform
> > > has no hardware coherent DMA support. DMA coherence is implemented via
> > > disabled page caching that slows down memcpy() due to memory controller
> > > behaviour.
> > >
> > > * kmalloc:            rate = (114 +- 5) MBps
> > >                       time = (84 +- 4) usec
> > > * usb_alloc_coherent: rate = (28.1 +- 0.1) MBps
> > >                       time = (341 +- 2) usec
> > >
> > > Note that the quantitative difference (this commit leads to a 4 times
> > > acceleration) leads to a qualitative behavior change in this case. As
> > > stated before, the video stream cannot be successfully received at
> > > AM335x platforms with MUSB based USB host controller due to performance
> > > issues [1].
> > >
> > > [1] https://www.spinics.net/lists/linux-usb/msg165735.html
> > >
> > > Signed-off-by: Matwey V. Kornilov <matwey@xxxxxxxxxx>
> >
> > Ping
> >

We are already using slab buffers in em28xx, so I think this change
makes sense, until we have a proper non-coherent allocation API.

Acked-by: Ezequiel Garcia <ezequiel@xxxxxxxxxxxxx>

Thanks a lot, Matwey, for your work.
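
For anyone who wants to sanity-check the quoted figures, here is a rough
sketch of the arithmetic behind the "4 times" and "agree within error
ranges" statements. The helper names are made up for illustration, the
ratio uncertainty assumes the two measurements are independent, and
"agree" is taken to mean that the +- intervals overlap, which is only my
reading of the commit message.

```python
import math

def ratio_with_error(a, da, b, db):
    """Return a/b and its uncertainty (assumption: independent errors,
    relative uncertainties added in quadrature)."""
    r = a / b
    dr = r * math.sqrt((da / a) ** 2 + (db / b) ** 2)
    return r, dr

def intervals_overlap(a, da, b, db):
    """True if [a-da, a+da] and [b-db, b+db] overlap (assumed meaning of
    'agree within error ranges')."""
    return abs(a - b) <= da + db

# AM335x numbers from the commit message: kmalloc vs usb_alloc_coherent.
rate_speedup, rate_err = ratio_with_error(114, 5, 28.1, 0.1)   # MBps
time_speedup, time_err = ratio_with_error(341, 2, 84, 4)       # usec

print(f"AM335x rate speedup: {rate_speedup:.2f} +- {rate_err:.2f}")
print(f"AM335x time speedup: {time_speedup:.2f} +- {time_err:.2f}")

# x86_64 numbers: do the two allocators differ measurably?
print("x86_64 rates overlap:", intervals_overlap(2.0, 0.4, 3.4, 1.2))
print("x86_64 times overlap:", intervals_overlap(5.0, 3.0, 3.5, 3.0))
print("AM335x rates overlap:", intervals_overlap(114, 5, 28.1, 0.1))
```

Both AM335x ratios come out close to 4, and the x86_64 intervals overlap
while the AM335x ones do not, which is consistent with the description
in the commit message.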