Re: [PATCH v2 3/3] lib: zstd: Don't add -O3 to cflags

Hi Nick,

On Wed, Nov 17, 2021 at 9:08 PM Nick Terrell <nickrterrell@xxxxxxxxx> wrote:
> From: Nick Terrell <terrelln@xxxxxx>
>
> After the update to zstd-1.4.10, passing -O3 is no longer necessary to
> get good performance from zstd. The default optimization level, -O2,
> is sufficient.
>
> I've measured no significant change to compression speed, and a ~1%
> decompression speed loss, which is acceptable.
>
> This fixes the reported parisc -Wframe-larger-than=1536 errors [0]. The
> gcc-8-hppa-linux-gnu compiler performed very poorly with -O3, generating
> stacks that are ~3KB. With -O2 these same functions generate stacks
> under 100B, completely fixing the problem. Stack frame size deltas (in
> bytes) are listed below:
>
> ZSTD_compressBlock_fast_extDict_generic: 3800 -> 68
> ZSTD_compressBlock_fast: 2216 -> 40
> ZSTD_compressBlock_fast_dictMatchState: 1848 -> 64
> ZSTD_compressBlock_doubleFast_extDict_generic: 3744 -> 76
> ZSTD_fillDoubleHashTable: 3252 -> 0
> ZSTD_compressBlock_doubleFast: 5856 -> 36
> ZSTD_compressBlock_doubleFast_dictMatchState: 5380 -> 84
> ZSTD_compressBlock_lazy2: 2420 -> 72
>
> Additionally, this improves the reported code bloat [1]. With gcc-11
> bloat-o-meter shows an 80KB code size improvement:
>
> ```
> > ../scripts/bloat-o-meter vmlinux.old vmlinux
> add/remove: 31/8 grow/shrink: 24/155 up/down: 25734/-107924 (-82190)
> Total: Before=6418562, After=6336372, chg -1.28%
> ```
>
> Compared to before the zstd-1.4.10 update we see a total code size
> regression of 105KB, down from 374KB at v5.16-rc1:
>
> ```
> > ../scripts/bloat-o-meter vmlinux.old vmlinux
> add/remove: 292/62 grow/shrink: 56/88 up/down: 235009/-127487 (107522)
> Total: Before=6228850, After=6336372, chg +1.73%
> ```
>
> [0] https://lkml.org/lkml/2021/11/15/710
> [1] https://lkml.org/lkml/2021/11/14/189
>
> Reported-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
> Signed-off-by: Nick Terrell <terrelln@xxxxxx>
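
For anyone who wants to sanity-check the stack numbers quoted above against
the -Wframe-larger-than=1536 limit, here is a minimal Python sketch. The
sizes and the limit are taken from the commit message; the names
FRAME_LIMIT, FRAME_SIZES and over_limit() are made up for illustration and
are not part of the patch or of any kernel script.

```python
# Stack frame sizes in bytes as (with -O3, with -O2), taken from the
# commit message quoted above. All names here are illustrative only.
FRAME_LIMIT = 1536  # threshold from -Wframe-larger-than=1536

FRAME_SIZES = {
    "ZSTD_compressBlock_fast_extDict_generic": (3800, 68),
    "ZSTD_compressBlock_fast": (2216, 40),
    "ZSTD_compressBlock_fast_dictMatchState": (1848, 64),
    "ZSTD_compressBlock_doubleFast_extDict_generic": (3744, 76),
    "ZSTD_fillDoubleHashTable": (3252, 0),
    "ZSTD_compressBlock_doubleFast": (5856, 36),
    "ZSTD_compressBlock_doubleFast_dictMatchState": (5380, 84),
    "ZSTD_compressBlock_lazy2": (2420, 72),
}


def over_limit(sizes, column, limit=FRAME_LIMIT):
    """Return the function names whose frame size in `column` exceeds `limit`.

    column 0 selects the -O3 size, column 1 the -O2 size.
    """
    return [name for name, pair in sizes.items() if pair[column] > limit]


if __name__ == "__main__":
    print("over the limit with -O3:", len(over_limit(FRAME_SIZES, 0)))
    print("over the limit with -O2:", len(over_limit(FRAME_SIZES, 1)))
```

With the figures above, every listed function exceeds the limit at -O3 and
none does at -O2.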

Impact on vmlinux for atari_defconfig:

    add/remove: 22/3 grow/shrink: 7/91 up/down: 3246/-35548 (-32302)

Impact on lib/zstd/zstd_compress.ko for atari_defconfig:

    add/remove: 63/5 grow/shrink: 23/197 up/down: 13410/-168604 (-155194)
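
In case it helps when comparing these summaries across configs, a rough
Python sketch for parsing the bloat-o-meter summary line and checking that
up + down matches the net figure in parentheses. The regular expression is
an assumption based only on the lines shown in this thread, not on the
script's source, and parse_summary() and is_consistent() are hypothetical
helper names.

```python
import re

# Pattern inferred from the summary lines in this thread (an assumption,
# not taken from the bloat-o-meter source).
SUMMARY_RE = re.compile(
    r"add/remove: (\d+)/(\d+) grow/shrink: (\d+)/(\d+) "
    r"up/down: (\d+)/(-?\d+) \((-?\d+)\)"
)

FIELDS = ("add", "remove", "grow", "shrink", "up", "down", "net")


def parse_summary(line):
    """Parse one summary line into a dict of integer fields."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a bloat-o-meter summary line: %r" % line)
    return dict(zip(FIELDS, map(int, match.groups())))


def is_consistent(summary):
    """True if bytes added plus bytes removed equals the reported net."""
    return summary["up"] + summary["down"] == summary["net"]


if __name__ == "__main__":
    lines = [
        "add/remove: 22/3 grow/shrink: 7/91 up/down: 3246/-35548 (-32302)",
        "add/remove: 63/5 grow/shrink: 23/197 up/down: 13410/-168604 (-155194)",
    ]
    for line in lines:
        summary = parse_summary(line)
        print(summary["net"], is_consistent(summary))
```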

Tested-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
Reviewed-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


