On Wed, Aug 4, 2021 at 10:40 PM Palmer Dabbelt <palmer@xxxxxxxxxxx> wrote:
>
> On Tue, 03 Aug 2021 09:54:34 PDT (-0700), mcroce@xxxxxxxxxxxxxxxxxxx wrote:
> > On Mon, Jul 19, 2021 at 1:44 PM Matteo Croce <mcroce@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> From: Matteo Croce <mcroce@xxxxxxxxxxxxx>
> >>
> >> Use the generic routines which handle alignment properly.
> >>
> >> These are the performances measured on a BeagleV machine for a
> >> 32 mbyte buffer:
> >>
> >> memcpy:
> >> original aligned:        75 Mb/s
> >> original unaligned:      75 Mb/s
> >> new aligned:            114 Mb/s
> >> new unaligned:          107 Mb/s
> >>
> >> memset:
> >> original aligned:       140 Mb/s
> >> original unaligned:     140 Mb/s
> >> new aligned:            241 Mb/s
> >> new unaligned:          241 Mb/s
> >>
> >> TCP throughput with iperf3 gives a similar improvement as well.
> >>
> >> This is the binary size increase according to bloat-o-meter:
> >>
> >> add/remove: 0/0 grow/shrink: 4/2 up/down: 432/-36 (396)
> >> Function                                     old     new   delta
> >> memcpy                                        36     324    +288
> >> memset                                        32     148    +116
> >> strlcpy                                      116     132     +16
> >> strscpy_pad                                   84      96     +12
> >> strlcat                                      176     164     -12
> >> memmove                                       76      52     -24
> >> Total: Before=1225371, After=1225767, chg +0.03%
> >>
> >> Signed-off-by: Matteo Croce <mcroce@xxxxxxxxxxxxx>
> >> Signed-off-by: Emil Renner Berthing <kernel@xxxxxxxx>
> >> ---
> >
> > Hi,
> >
> > can someone have a look at this change and share opinions?
>
> This LGTM. How are the generic string routines landing? I'm happy to
> take this into my for-next, but IIUC we need the optimized generic
> versions first so we don't have a performance regression falling back to
> the trivial ones for a bit. Is there a shared tag I can pull in?

Hi,

I see them only in linux-next by now.

-- 
per aspera ad upstream