In days of yore (Thu, 21 Mar 2024), Stephen Smoogen thus quoth:
> On Wed, 20 Mar 2024 at 22:01, Kevin Kofler via devel <
> devel@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> > Aoife Moloney wrote:
> > > The zstd compression type was chosen to match createrepo_c settings.
> > > As an alternative, we might want to choose xz,
> >
> > Since xz consistently compresses better than zstd, I would strongly
> > suggest using xz everywhere to minimize download sizes. However:
> >
> > > especially after zlib-ng has been made the default in Fedora and brought
> > > performance improvements.
> >
> > zlib-ng is for gz, not xz, and gz is fast, but compresses extremely poorly
> > (which is mostly due to the format, so, while some implementations manage
> > to do better than others at the expense of more compression time, there is a
> > limit to how well they can do and it is nowhere near xz or even zstd) and
> > should hence never be used at all.
>
> There are two parts to this which users will see as 'slowness'. Part one is
> downloading the data from a mirror. Part two is uncompressing the data. In
> work I have been a part of, we have found that while xz gave us much
> smaller files, the time to uncompress was so much larger that our download
> gains were lost. Using zstd gave larger downloads (maybe 10 to 20% bigger)
> but uncompressed much faster than xz. This is data dependent though so it
> would be good to see if someone could test to see if xz uncompression of
> the datafiles will be too slow.

Hi there,

I ran tests with gzip 1-9 and xz 1-9 on an F41 XML file that was 940 MiB.
Input File: f41-filelist.xml, Size: 985194446 bytes

XZ level 1 :     21s to compress, 5.3% filesize, 4.4s to decompress
XZ level 2 :     28s to compress, 5.1% filesize, 4.2s to decompress
XZ level 3 :     44s to compress, 5.1% filesize, 4.2s to decompress
XZ level 4 :     55s to compress, 5.3% filesize, 4.5s to decompress
XZ level 5 : 1min25s to compress, 5.3% filesize, 4.3s to decompress
XZ level 6 : 2min49s to compress, 5.1% filesize, 4.4s to decompress
XZ level 7 : 2min55s to compress, 4.8% filesize, 4.2s to decompress
XZ level 8 : 3min 4s to compress, 4.8% filesize, 4.2s to decompress
XZ level 9 : 3min12s to compress, 4.8% filesize, 4.2s to decompress

Input File: f41-filelist.xml, Size: 985194446 bytes

GZ level 1 :  6s to compress, 7.9% filesize, 4.2s to decompress
GZ level 2 :  6s to compress, 7.8% filesize, 4.1s to decompress
GZ level 3 :  7s to compress, 7.6% filesize, 4.1s to decompress
GZ level 4 :  8s to compress, 6.8% filesize, 4.0s to decompress
GZ level 5 :  9s to compress, 6.6% filesize, 4.0s to decompress
GZ level 6 : 12s to compress, 6.6% filesize, 4.0s to decompress
GZ level 7 : 15s to compress, 6.5% filesize, 4.0s to decompress
GZ level 8 : 24s to compress, 6.4% filesize, 4.0s to decompress
GZ level 9 : 28s to compress, 6.3% filesize, 4.0s to decompress

xz level 2 is not a shabby compromise as you get small filesize and time
to compress is the same as gzip level 9. To get the smallest filesizes,
the time (and memory requirements) of xz becomes very noticeable for not
much gain.
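
Since the user-visible cost is download time plus decompression time, a
rough way to combine the two is sketched below. It only plugs the sizes
and decompression timings from the results above into a simple formula.
The link speeds (10 and 100 Mbit/s) are hypothetical, total_time is an
illustrative helper and not part of the test script, and protocol
overhead and mirror latency are ignored.

```bash
#!/bin/bash
# Sketch: estimated "download + decompress" time for a compressed file.
# Sizes and decompression times come from the measurements above;
# the link speeds below are assumptions chosen for illustration only.

INPUT_BYTES=985194446   # uncompressed size of f41-filelist.xml

# total_time PERCENT_OF_ORIGINAL DECOMPRESS_SECONDS LINK_MBIT_PER_S
# Prints the estimated total in seconds, rounded to one decimal place.
total_time() {
    LC_ALL=C awk -v pct="$1" -v dec="$2" -v mbit="$3" -v bytes="$INPUT_BYTES" 'BEGIN {
        compressed = bytes * pct / 100                  # compressed size in bytes
        download   = compressed * 8 / (mbit * 1000000)  # seconds spent downloading
        printf "%.1f\n", download + dec
    }'
}

for mbit in 10 100; do
    echo "At ${mbit} Mbit/s (assumed link speed):"
    echo "  xz -2   : $(total_time 5.1 4.2 "$mbit")s"
    echo "  gzip -9 : $(total_time 6.3 4.0 "$mbit")s"
done
```

Under these assumptions the estimate is about 44.4s for xz -2 versus
53.7s for gzip -9 at 10 Mbit/s, and 8.2s versus 9.0s at 100 Mbit/s. The
gap narrows as the link gets faster, because the near-identical
decompression times start to dominate.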
#!/bin/bash

INPUTFILE=f41-filelist.xml
# Size of the uncompressed input in bytes
INPUTFILESIZE=$(wc -c < "${INPUTFILE}")

## gzip
do_gzip() {
    local cl compressed_size
    echo "Input File: ${INPUTFILE}, Size: ${INPUTFILESIZE} bytes"
    echo
    for cl in {1..9}
    do
        echo "GZip compression level ${cl}"
        echo "Time to compress the file"
        # -k keeps the original file so every level sees the same input
        time gzip -k "-${cl}" "${INPUTFILE}"
        compressed_size=$(wc -c < "${INPUTFILE}.gz")
        echo "Compressed to"
        # bc needs a ';' between the scale assignment and the expression
        echo "scale=5; ${compressed_size}/${INPUTFILESIZE}*100" | bc
        echo "% of original"
        echo "Time to decompress the file, output to /dev/null"
        time gzip -d -c "${INPUTFILE}.gz" > /dev/null
        rm -f "${INPUTFILE}.gz"
        echo
    done
}

## xz
do_xz() {
    local cl compressed_size
    echo "Input File: ${INPUTFILE}, Size: ${INPUTFILESIZE} bytes"
    echo
    for cl in {1..9}
    do
        echo "XZ compression level ${cl}"
        echo "Time to compress the file"
        # -k keeps the original file so every level sees the same input
        time xz -k -z "-${cl}" "${INPUTFILE}"
        compressed_size=$(wc -c < "${INPUTFILE}.xz")
        echo "Compressed to"
        echo "scale=5; ${compressed_size}/${INPUTFILESIZE}*100" | bc
        echo "% of original"
        echo "Time to decompress the file, output to /dev/null"
        time xz -d -c "${INPUTFILE}.xz" > /dev/null
        rm -f "${INPUTFILE}.xz"
        echo
    done
}

do_gzip
do_xz

--
Kind regards,

/S