Re: [PATCH v2 2/3] lib: zstd: Don't inline functions in zstd_opt.c

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Nick,

On Wed, Nov 17, 2021 at 9:08 PM Nick Terrell <nickrterrell@xxxxxxxxx> wrote:
> `zstd_opt.c` contains the match finder for the highest compression
> levels. These levels are already very slow, and are unlikely to be used
> in the kernel. If they are used, they shouldn't be used in latency
> sensitive workloads, so slowing them down shouldn't be a big deal.
>
> This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
> I've also opened an issue upstream [1] so that we can properly tackle
> the code size issue in `zstd_opt.c` for all users, and can hopefully
> remove this hack in the next zstd version we import.
>
> Bloat-o-meter output on x86-64:
>
> ```
> > ../scripts/bloat-o-meter vmlinux.old vmlinux
> add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
> Function                                     old     new   delta
> ZSTD_compressBlock_opt_generic.constprop       -    7559   +7559
> ZSTD_insertBtAndGetAllMatches                  -    6304   +6304
> ZSTD_insertBt1                                 -    1731   +1731
> ZSTD_storeSeq                                  -     693    +693
> ZSTD_BtGetAllMatches                           -     255    +255
> ZSTD_updateRep                                 -     128    +128
> ZSTD_updateTree                               96      99      +3
> ZSTD_insertAndFindFirstIndexHash3             81       -     -81
> ZSTD_setBasePrices.constprop                  98       -     -98
> ZSTD_litLengthPrice.constprop                138       -    -138
> ZSTD_count                                   362     181    -181
> ZSTD_count_2segments                        1407     938    -469
> ZSTD_insertBt1.constprop                    2689       -   -2689
> ZSTD_compressBlock_btultra2                19990     423  -19567
> ZSTD_compressBlock_btultra                 19633      15  -19618
> ZSTD_initStats_ultra                       19825       -  -19825
> ZSTD_compressBlock_btopt                   20374      12  -20362
> ZSTD_compressBlock_btopt_extDict           29984      12  -29972
> ZSTD_compressBlock_btultra_extDict         30718      15  -30703
> ZSTD_compressBlock_btopt_dictMatchState    32689      12  -32677
> ZSTD_compressBlock_btultra_dictMatchState   33574      15  -33559
> Total: Before=6611828, After=6418562, chg -2.92%
> ```
>
> [0] https://lkml.org/lkml/2021/11/14/189
> [1] https://github.com/facebook/zstd/issues/2862
>
> Reported-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
> Signed-off-by: Nick Terrell <terrelln@xxxxxx>

Thanks for your patch!

Impact on lib/zstd/zstd_compress.ko for atari_defconfig:

    add/remove: 5/4 grow/shrink: 1/9 up/down: 15392/-167214 (-151822)

Nice!

Tested-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>

> --- a/lib/zstd/compress/zstd_opt.c
> +++ b/lib/zstd/compress/zstd_opt.c
> @@ -894,7 +906,7 @@ static void ZSTD_optLdm_processMatchCandidate(ZSTD_optLdm_t* optLdm, ZSTD_match_
>               */
>              U32 posOvershoot = currPosInBlock - optLdm->endPosInBlock;
>              ZSTD_optLdm_skipRawSeqStoreBytes(&optLdm->seqStore, posOvershoot);
> -        }
> +        }
>          ZSTD_opt_getNextMatchAndUpdateSeqStore(optLdm, currPosInBlock, remainingBytes);
>      }
>      ZSTD_optLdm_maybeAddMatch(matches, nbMatches, optLdm, currPosInBlock);

This change is unrelated. With that removed:
Reviewed-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



[Index of Archives]     [Linux SoC]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux