The patch titled
     Subject: xz: use 128 MiB dictionary and force single-threaded mode
has been added to the -mm mm-nonmm-unstable branch. Its filename is
     xz-use-128-mib-dictionary-and-force-single-threaded-mode.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/xz-use-128-mib-dictionary-and-force-single-threaded-mode.patch

This patch will later appear in the mm-nonmm-unstable branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Lasse Collin <lasse.collin@xxxxxxxxxxx>
Subject: xz: use 128 MiB dictionary and force single-threaded mode
Date: Sun, 21 Jul 2024 16:36:28 +0300

This only affects kernel image compression, not any other xz usage.

Desktop kernels on x86-64 are already around 60 MiB. Using a dictionary
larger than 32 MiB should have no downsides nowadays as anyone building
the kernel should have plenty of RAM. 128 MiB dictionary needs 1346 MiB
of RAM with xz versions 5.0.x - 5.6.x in single-threaded mode. On archs
that use xz_wrap.sh, kernel decompression is done in single-call mode so a
larger dictionary doesn't affect boot-time memory requirements.

xz >= 5.6.0 uses multithreaded mode by default which compresses slightly
worse than single-threaded mode. Kernel compression rarely used more than
one thread anyway because with 32 MiB dictionary size the default block
size was 96 MiB in multithreaded mode.
So only a single thread was used anyway unless the kernel was over 96 MiB.

Comparison to CONFIG_KERNEL_LZMA: It uses "lzma -9" which mapped to 32 MiB
dictionary in LZMA Utils 4.32.7 (the final release in 2008). Nowadays the
lzma tool on most systems is from XZ Utils where -9 maps to 64 MiB
dictionary. So using a 32 MiB dictionary with CONFIG_KERNEL_XZ may have
compressed big kernels slightly worse than the old LZMA option.

Comparison to CONFIG_KERNEL_ZSTD: zstd uses 128 MiB dictionary.

Link: https://lkml.kernel.org/r/20240721133633.47721-14-lasse.collin@xxxxxxxxxxx
Signed-off-by: Lasse Collin <lasse.collin@xxxxxxxxxxx>
Reviewed-by: Sam James <sam@xxxxxxxxxx>
Cc: Albert Ou <aou@xxxxxxxxxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Emil Renner Berthing <emil.renner.berthing@xxxxxxxxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Cc: Joel Stanley <joel@xxxxxxxxx>
Cc: Jonathan Corbet <corbet@xxxxxxx>
Cc: Jubin Zhong <zhongjubin@xxxxxxxxxx>
Cc: Jules Maselbas <jmaselbas@xxxxxxxx>
Cc: Krzysztof Kozlowski <krzk@xxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Palmer Dabbelt <palmer@xxxxxxxxxxx>
Cc: Paul Walmsley <paul.walmsley@xxxxxxxxxx>
Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
Cc: Rui Li <me@xxxxxxxxx>
Cc: Simon Glass <sjg@xxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 scripts/xz_wrap.sh |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/scripts/xz_wrap.sh~xz-use-128-mib-dictionary-and-force-single-threaded-mode
+++ a/scripts/xz_wrap.sh
@@ -16,4 +16,15 @@ case $SRCARCH in
 	sparc)          BCJ=--sparc ;;
 esac
 
-exec $XZ --check=crc32 $BCJ --lzma2=$LZMA2OPTS,dict=32MiB
+# Use single-threaded mode because it compresses a little better
+# (and uses less RAM) than multithreaded mode.
+#
+# For the best compression, the dictionary size shouldn't be
+# smaller than the uncompressed kernel. 128 MiB dictionary
+# needs less than 1400 MiB of RAM in single-threaded mode.
+#
+# On the archs that use this script to compress the kernel,
+# decompression in the preboot code is done in single-call mode.
+# Thus the dictionary size doesn't affect the memory requirements
+# of the preboot decompressor at all.
+exec $XZ --check=crc32 --threads=1 $BCJ --lzma2=$LZMA2OPTS,dict=128MiB
_

Patches currently in -mm which might be from lasse.collin@xxxxxxxxxxx are

maintainers-add-xz-embedded-maintainer.patch
licenses-add-0bsd-license-text.patch
xz-switch-from-public-domain-to-bsd-zero-clause-license-0bsd.patch
xz-fix-comments-and-coding-style.patch
xz-fix-kernel-doc-formatting-errors-in-xzh.patch
xz-improve-the-microlzma-kernel-doc-in-xzh.patch
xz-documentation-staging-xzrst-revise-thoroughly.patch
docs-add-xz_extern-to-c_id_attributes.patch
xz-cleanup-crc32-edits-from-2018.patch
xz-optimize-for-loop-conditions-in-the-bcj-decoders.patch
xz-add-arm64-bcj-filter.patch
xz-add-risc-v-bcj-filter.patch
xz-use-128-mib-dictionary-and-force-single-threaded-mode.patch
xz-adjust-arch-specific-options-for-better-kernel-compression.patch
arm64-boot-add-imagexz-support.patch
riscv-boot-add-imagexz-support.patch
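
Illustrative note, not part of the patch: the changelog's point that kernel
compression rarely used more than one thread follows from simple arithmetic.
The sketch below assumes that xz's default block size in multithreaded mode
is three times the LZMA2 dictionary size, which matches the 32 MiB -> 96 MiB
figure quoted in the changelog. The function name blocks_needed and the
100 MiB input are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Assumption: multithreaded xz splits the input into blocks of
# 3 x dictionary size, and each block can occupy at most one worker thread.

# blocks_needed KERNEL_MIB DICT_MIB
# Prints the number of blocks an input of KERNEL_MIB MiB is split into.
blocks_needed() {
	kernel_mib=$1
	dict_mib=$2
	block_mib=$((dict_mib * 3))
	# Ceiling division: a partial block still counts as one block.
	echo $(( (kernel_mib + block_mib - 1) / block_mib ))
}

blocks_needed 60 32     # ~60 MiB desktop kernel, 96 MiB blocks
blocks_needed 100 32    # hypothetical kernel just over 96 MiB
```

Under that assumption a 60 MiB kernel fits in a single block, so extra
threads would have had nothing to do even before --threads=1 was forced.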