Re: [PATCH] mtd: phram: Map RAM using memremap instead of ioremap

Petr Malat <oss@xxxxxxxxx> · Tue, 7 Jun 2022 12:24:52 +0200

Hi!

On Mon, May 23, 2022 at 04:09:20PM +0000, David Laight wrote:
> On x86 (which I know a lot more about) memcpy() has a nasty
> habit of getting implemented as 'rep movsb' relying on the
> cpu to speed it up.
> But that doesn't happen for uncached addresses - so you get
> very slow byte copies.

I have measured the performance with my change (Patched) and without
it (Orig). The change improves performance on x86_64 and ARM; on
MIPS64 it stays the same:

Tests
=====
All runtimes are in milliseconds and are the average real time of 3
runs, measured with the bash time built-in. The measured process ran
under SCHED_FIFO with priority 99. The page cache was flushed before
every run, but all involved program images were in tmpfs (no swap).
 - dd r512
   dd if=/dev/TESTDEV of=/dev/null  bs=512
 - dd r1MB
   dd if=/dev/TESTDEV of=/dev/null  bs=1M
 - dd w512
   dd of=/dev/TESTDEV if=/tmpfs/img bs=512
 - dd w1MB
   dd of=/dev/TESTDEV if=/tmpfs/img bs=1M
 - flashcp
   flashcp /tmpfs/img /dev/TESTDEV
 - flasherase
   flash_eraseall -q /dev/TESTDEV
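
The procedure above could be scripted roughly as follows. This is only
a sketch, not the script actually used: avg_ms is a hypothetical helper
name, and using chrt for SCHED_FIFO priority 99 and writing 3 to
/proc/sys/vm/drop_caches to flush the page cache are assumptions about
how the setup might be reproduced.

```shell
#!/bin/bash
# Sketch: average wall-clock time of a command over 3 runs, in milliseconds.
# avg_ms is a hypothetical helper, not part of the original measurements.

avg_ms() {
    local runs=3 total=0 i t
    local TIMEFORMAT=%R    # make `time` print only real time, e.g. "1.234"
    for ((i = 0; i < runs; i++)); do
        # Flush the page cache before every run. This needs root, so a
        # failure is ignored here to keep the sketch usable unprivileged.
        sync
        { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
        # Capture the report of the bash `time` keyword; the command's own
        # stdout and stderr are discarded.
        t=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
        # "1.234" s -> 1234 ms: drop the decimal separator, force base 10.
        total=$(( total + 10#${t//[^0-9]/} ))
    done
    # Integer mean, truncated to whole milliseconds.
    echo $(( total / runs ))
}

# Example invocations (assumed; need root and a real /dev/TESTDEV):
#   avg_ms chrt -f 99 dd if=/dev/TESTDEV of=/dev/null bs=512
#   avg_ms chrt -f 99 flashcp /tmpfs/img /dev/TESTDEV
```

Discarding the command output keeps only the timing line, so each run
yields a single number that can be summed without further parsing.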

Results
=======
All times are in ms

ARCH       |     MIPS64      |       ARM       |     X86_64
CPU        |   CN6335p2.2    |    v7 TI K2     |  Xeon D-1548
Dev. size  |      32MB       |      128MB      |     256MB
-----------+-------+---------+-------+---------+-------+---------
     in ms |  Orig | Patched |  Orig | Patched |  Orig | Patched
dd r512    |   131 |     130 |  1101 |     543 | 22906 |     281
dd r1MB    |    65 |      65 |   655 |     122 | 22715 |      70
dd w512    |  1150 |    1150 |  1136 |    1042 | 28067 |     412
dd w1MB    |   104 |     104 |   396 |     244 | 27761 |     122
flashcp    |   100 |      99 |  1438 |     568 | 78455 |     270
flasherase |    21 |      21 |   208 |      77 | 27707 |      57
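
To put the table in relative terms, the Orig/Patched ratio for each row
can be computed with a few lines of awk. This is just a convenience
sketch over the numbers above; speedups is a hypothetical helper name
and the output format is arbitrary.

```shell
#!/bin/bash
# Sketch: print Orig/Patched speedup factors for each row of the table.
# speedups is a hypothetical helper; it reads table rows on stdin.

speedups() {
    awk -F'|' '
        # Data rows have 7 fields: test name, then Orig and Patched for
        # each of the three architectures. Header rows are skipped
        # because their second field is not a positive number.
        NF == 7 && $2 + 0 > 0 {
            name = $1
            sub(/ +$/, "", name)
            printf "%-10s  MIPS64 %5.1fx  ARM %5.1fx  x86_64 %5.1fx\n",
                   name, $2 / $3, $4 / $5, $6 / $7
        }'
}

speedups <<'EOF'
dd r512    |   131 |     130 |  1101 |     543 | 22906 |     281
dd r1MB    |    65 |      65 |   655 |     122 | 22715 |      70
dd w512    |  1150 |    1150 |  1136 |    1042 | 28067 |     412
dd w1MB    |   104 |     104 |   396 |     244 | 27761 |     122
flashcp    |   100 |      99 |  1438 |     568 | 78455 |     270
flasherase |    21 |      21 |   208 |      77 | 27707 |      57
EOF
```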

BR,
  Petr