Re: [PATCH] riscv: use the generic string routines

Matteo Croce <mcroce@xxxxxxxxxxxxxxxxxxx> · Thu, 5 Aug 2021 12:31:04 +0200

On Wed, Aug 4, 2021 at 10:40 PM Palmer Dabbelt <palmer@xxxxxxxxxxx> wrote:
>
> On Tue, 03 Aug 2021 09:54:34 PDT (-0700), mcroce@xxxxxxxxxxxxxxxxxxx wrote:
> > On Mon, Jul 19, 2021 at 1:44 PM Matteo Croce <mcroce@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> From: Matteo Croce <mcroce@xxxxxxxxxxxxx>
> >>
> >> Use the generic routines which handle alignment properly.
> >>
> >> These are the throughput figures measured on a BeagleV machine with a
> >> 32 MB buffer:
> >>
> >> memcpy:
> >> original aligned:        75 Mb/s
> >> original unaligned:      75 Mb/s
> >> new aligned:            114 Mb/s
> >> new unaligned:          107 Mb/s
> >>
> >> memset:
> >> original aligned:       140 Mb/s
> >> original unaligned:     140 Mb/s
> >> new aligned:            241 Mb/s
> >> new unaligned:          241 Mb/s
> >>
> >> TCP throughput with iperf3 gives a similar improvement as well.
> >>
> >> This is the binary size increase according to bloat-o-meter:
> >>
> >> add/remove: 0/0 grow/shrink: 4/2 up/down: 432/-36 (396)
> >> Function                                     old     new   delta
> >> memcpy                                        36     324    +288
> >> memset                                        32     148    +116
> >> strlcpy                                      116     132     +16
> >> strscpy_pad                                   84      96     +12
> >> strlcat                                      176     164     -12
> >> memmove                                       76      52     -24
> >> Total: Before=1225371, After=1225767, chg +0.03%
> >>
> >> Signed-off-by: Matteo Croce <mcroce@xxxxxxxxxxxxx>
> >> Signed-off-by: Emil Renner Berthing <kernel@xxxxxxxx>
> >> ---
> >
> > Hi,
> >
> > can someone have a look at this change and share opinions?
>
> This LGTM.  How are the generic string routines landing?  I'm happy to
> take this into my for-next, but IIUC we need the optimized generic
> versions first so we don't have a performance regression falling back to
> the trivial ones for a bit.  Is there a shared tag I can pull in?

Hi,

So far I see them only in linux-next.
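
For anyone following the thread without the series at hand, the "handle
alignment properly" point in the quoted description can be illustrated with a
small userspace sketch. This is not the code from the generic string series:
the name sketch_memcpy is hypothetical, and the approach shown (byte copies
until the destination is word aligned, then word-sized copies, then a byte
tail) is only one simple way to do it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative sketch only, NOT the kernel's generic memcpy.
 * sketch_memcpy is a hypothetical name used for this example.
 */
void *sketch_memcpy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	const size_t wsize = sizeof(unsigned long);

	/* Head: copy single bytes until the destination is word aligned. */
	while (n > 0 && ((uintptr_t)d % wsize) != 0) {
		*d++ = *s++;
		n--;
	}

	/*
	 * Body: move one word per iteration. The source may still be
	 * misaligned here, so the word is loaded and stored through
	 * fixed-size memcpy() calls instead of a cast pointer; this keeps
	 * the sketch free of unaligned or aliasing-violating accesses.
	 */
	while (n >= wsize) {
		unsigned long w;

		memcpy(&w, s, wsize);
		memcpy(d, &w, wsize);
		d += wsize;
		s += wsize;
		n -= wsize;
	}

	/* Tail: copy whatever is left, byte by byte. */
	while (n > 0) {
		*d++ = *s++;
		n--;
	}

	return dest;
}
```

Calling sketch_memcpy(dst + 3, src + 1, 41) exercises all three phases on a
64-bit machine, since neither pointer starts on a word boundary and the length
is not a multiple of the word size.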

-- 
per aspera ad upstream