On Fri, Sep 01, 2023 at 10:55:50AM +0100, Daniel P. Berrangé wrote:
> On Fri, Sep 01, 2023 at 10:42:16AM +0100, Richard W.M. Jones wrote:
> > On Fri, Sep 01, 2023 at 10:48:14AM +0200, Kevin Wolf wrote:
> > > Am 31.08.2023 um 11:20 hat Richard W.M. Jones geschrieben:
> > > > On Thu, Aug 31, 2023 at 11:05:55AM +0200, Kevin Wolf wrote:
> > > > > [ Cc: qemu-block ]
> > > > >
> > > > > Am 30.08.2023 um 20:26 hat Richard W.M. Jones geschrieben:
> > > > > > On Tue, Aug 29, 2023 at 05:49:24PM -0000, Daniel Alley wrote:
> > > > > > > > The background to this is I've spent far too long trying to
> > > > > > > > optimize the conversion of qcow2 files to raw files.  Most
> > > > > > > > existing qcow2 files that you can find online are zlib
> > > > > > > > compressed, including the qcow2 images provided by Fedora.
> > > > > > > > Each cluster in the file is separately compressed as a zlib
> > > > > > > > stream, and qemu uses zlib library functions to decompress
> > > > > > > > them.  When downloading and decompressing these files, I
> > > > > > > > measured that 40%+ of the total CPU time is spent doing zlib
> > > > > > > > decompression.
> > > > > > > >
> > > > > > > > [You don't need to tell me how great Zstd is, qcow2 supports
> > > > > > > > this for compression also, but it is not widely used by
> > > > > > > > existing content.]
> > > > >
> > > > > You make it sound like compressing each cluster individually has a
> > > > > big impact.  If so, does increasing the cluster size make a
> > > > > difference, too?  That could be a change with fewer compatibility
> > > > > concerns.
> > > >
> > > > The issue we're discussing in the original thread is speed of
> > > > decompression.  We noted that using zlib-ng (a not-quite drop-in
> > > > replacement for zlib) improves decompression speed by 40% or more.
> > > >
> > > > Original thread:
> > > > https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/CDNPJ4SOTRQMYVCDI3ZSY4SP4FYESCWD/
> > > > zlib-ng proposed change:
> > > > https://src.fedoraproject.org/rpms/zlib-ng/pull-request/3
> > > >
> > > > Size of the compressed file is also a concern, but wasn't discussed.
> > >
> > > I understand the context and didn't really think about file size at
> > > all.  My question was essentially whether decompressing many small
> > > blocks (as we do today) performs significantly differently from
> > > decompressing fewer larger blocks (as we would do with a larger
> > > cluster size).
> >
> > I did a quick test just by adjusting the cluster size of a qcow2 file:
> >
> > $ virt-builder fedora-36
> > $ ls -lsh fedora-36.img
> > 1.2G -rw-r--r--. 1 rjones rjones 6.0G Sep 1 09:53 fedora-36.img
> > $ cat fedora-36.img fedora-36.img fedora-36.img fedora-36.img > test.raw
> > $ ls -lsh test.raw
> > 4.7G -rw-r--r--. 1 rjones rjones 24G Sep 1 09:53 test.raw
> > $ qemu-img convert -f raw test.raw -O qcow2 test.qcow2.zlib.4k -c -o compression_type=zlib,cluster_size=4096
> >
> > (for cluster sizes 4k, 64k, 512k, 2048k, and compression types zlib & zstd)
> >
> > I tested the speed of decompression using:
> >
> > $ hyperfine 'qemu-img convert -W -m 16 -f qcow2 test.qcow2.XXX -O raw test.out'
> > (qemu 8.0.0-4.fc39.x86_64)
> >
> > $ hyperfine 'nbdkit -U - --filter=qcow2dec file test.qcow2.XXX --run '\''nbdcopy --request-size "$uri" test.out'\'' '
> > (nbdkit-1.35.11-2.fc40.x86_64)
> >
> > Results:
> >
> > Cluster  Compression  Compressed size  Prog    Decompression speed
> >
> > 4k       zlib         3228811264       qemu    5.921 s ± 0.074 s
> > 4k       zstd         3258097664       qemu    5.189 s ± 0.158 s
> >
> > 4k       zlib/zstd                     nbdkit  failed, bug!!
> > 64k      zlib         3164667904       qemu    3.579 s ± 0.094 s
> > 64k      zstd         3132686336       qemu    1.770 s ± 0.060 s
> >
> > 64k      zlib         3164667904       nbdkit  1.254 s ± 0.065 s
> > 64k      zstd         3132686336       nbdkit  1.315 s ± 0.037 s
> >
> > 512k     zlib         3158744576       qemu    4.008 s ± 0.058 s
> > 512k     zstd         3032697344       qemu    1.503 s ± 0.072 s
> >
> > 512k     zlib         3158744576       nbdkit  1.702 s ± 0.026 s
> > 512k     zstd         3032697344       nbdkit  1.593 s ± 0.039 s
> >
> > 2048k    zlib         3197569024       qemu    4.327 s ± 0.051 s
> > 2048k    zstd         2995143168       qemu    1.465 s ± 0.085 s
> >
> > 2048k    zlib         3197569024       nbdkit  3.660 s ± 0.011 s
> > 2048k    zstd         2995143168       nbdkit  3.368 s ± 0.057 s
> >
> > No great surprises - a very small cluster size is inefficient, but
> > scaling up the cluster size gains performance, and zstd performs
> > better than zlib once the cluster size is sufficiently large.
>
> The default qcow2 cluster size is 64k, which means we've already
> got the vast majority of the performance and file size win.  Going
> beyond 64k defaults doesn't seem massively compelling.
>
> zstd does have a small space win over zlib as expected, but again
> nothing so drastic that it seems compelling to change - that win
> will be line noise in the overall bigger picture of image storage
> and download times.

Yeah, I was a bit surprised by this.  I expected zstd files to be
significantly smaller than zlib (even though that's not what zstd is
optimized for), not that they'd be about the same size.

> The major difference here is that zstd is much faster than zlib
> at decompression.  I'd be curious if zlib-ng closes that gap?

It's quite hard to use zlib-ng in Fedora (currently) since it requires
changes to the source code.  That is what the pull request being
discussed would change, as you could simply install zlib-ng-compat,
which would replace libz.so.

Anyway, I can't easily get results for qemu + zlib-ng, but we expect it
would be ~40% faster at decompression, and decompression is what takes
most of the time in the qemu numbers above.

I forgot to say that nbdkit is using zlib-ng, since I made the
source-level changes a few weeks back (but most of the nbdkit
performance improvement comes from being able to use lots of threads).

> If it does, then for the sake of image portability it'd be better
> to stick with zlib compression in qcow2 and leverage zlib-ng for
> speed, and ignore zstd.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
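
PS: for anyone who wants to rerun this, a rough, untested sketch of a
loop that would regenerate the whole matrix above, based only on the
commands quoted earlier in the thread.  It assumes test.raw already
exists, and the output file names here are illustrative rather than
exactly the ones used:

  #!/bin/sh
  # Build one compressed qcow2 per (compression type, cluster size)
  # combination and time decompression of each with hyperfine.
  set -e
  for comp in zlib zstd; do
      for cluster in 4096 65536 524288 2097152; do
          out=test.qcow2.$comp.$cluster
          qemu-img convert -f raw test.raw -O qcow2 "$out" -c \
              -o compression_type=$comp,cluster_size=$cluster
          ls -ls "$out"
          hyperfine "qemu-img convert -W -m 16 -f qcow2 $out -O raw test.out"
      done
  done

The nbdkit rows in the table would come from substituting each file
into the second hyperfine command (the nbdkit + nbdcopy one) in the
same way.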