On Fri, Sep 01, 2023 at 11:03:54AM +0100, Richard W.M. Jones wrote:
> On Fri, Sep 01, 2023 at 10:55:50AM +0100, Daniel P. Berrangé wrote:
> > On Fri, Sep 01, 2023 at 10:42:16AM +0100, Richard W.M. Jones wrote:
> > > On Fri, Sep 01, 2023 at 10:48:14AM +0200, Kevin Wolf wrote:
> > > > On 31.08.2023 at 11:20, Richard W.M. Jones wrote:
> > > > > On Thu, Aug 31, 2023 at 11:05:55AM +0200, Kevin Wolf wrote:
> > > > > > [ Cc: qemu-block ]
> > > > > >
> > > > > > On 30.08.2023 at 20:26, Richard W.M. Jones wrote:
> > > > > > > On Tue, Aug 29, 2023 at 05:49:24PM -0000, Daniel Alley wrote:
> > > > > > > > > The background to this is I've spent far too long trying to optimize
> > > > > > > > > the conversion of qcow2 files to raw files. Most existing qcow2 files
> > > > > > > > > that you can find online are zlib compressed, including the qcow2
> > > > > > > > > images provided by Fedora. Each cluster in the file is separately
> > > > > > > > > compressed as a zlib stream, and qemu uses zlib library functions to
> > > > > > > > > decompress them. When downloading and decompressing these files, I
> > > > > > > > > measured 40%+ of the total CPU time is doing zlib decompression.
> > > > > > > > >
> > > > > > > > > [You don't need to tell me how great Zstd is, qcow2 supports this for
> > > > > > > > > compression also, but it is not widely used by existing content.]
> > > > > >
> > > > > > You make it sound like compressing each cluster individually has a big
> > > > > > impact. If so, does increasing the cluster size make a difference, too?
> > > > > > That could be a change with fewer compatibility concerns.
> > > > >
> > > > > The issue we're discussing in the original thread is speed of
> > > > > decompression. We noted that using zlib-ng (a not-quite drop-in
> > > > > replacement for zlib) improves decompression speed by 40% or more.
> > > > >
> > > > > Original thread:
> > > > > https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/CDNPJ4SOTRQMYVCDI3ZSY4SP4FYESCWD/
> > > > > zlib-ng proposed change:
> > > > > https://src.fedoraproject.org/rpms/zlib-ng/pull-request/3
> > > > >
> > > > > Size of the compressed file is also a concern, but wasn't discussed.
> > > >
> > > > I understand the context and didn't really think about file size at all.
> > > >
> > > > My question was essentially if decompressing many small blocks (as we do
> > > > today) performs significantly different from decompressing fewer larger
> > > > blocks (as we would do with a larger cluster size).
> > >
> > > I did a quick test just by adjusting the cluster size of a qcow2 file:
> > >
> > >   $ virt-builder fedora-36
> > >   $ ls -lsh fedora-36.img
> > >   1.2G -rw-r--r--. 1 rjones rjones 6.0G Sep 1 09:53 fedora-36.img
> > >   $ cat fedora-36.img fedora-36.img fedora-36.img fedora-36.img > test.raw
> > >   $ ls -lsh test.raw
> > >   4.7G -rw-r--r--. 1 rjones rjones 24G Sep 1 09:53 test.raw
> > >   $ qemu-img convert -f raw test.raw -O qcow2 test.qcow2.zlib.4k -c -o compression_type=zlib,cluster_size=4096
> > >
> > > (for cluster sizes 4k, 64k, 512k, 2048k, and
> > > compression types zlib & zstd)
> > >
> > > I tested the speed of decompression using:
> > >
> > >   $ hyperfine 'qemu-img convert -W -m 16 -f qcow2 test.qcow2.XXX -O raw test.out'
> > >   (qemu 8.0.0-4.fc39.x86_64)
> > >
> > >   $ hyperfine 'nbdkit -U - --filter=qcow2dec file test.qcow2.XXX --run '\''nbdcopy --request-size "$uri" test.out'\'' '
> > >   (nbdkit-1.35.11-2.fc40.x86_64)
> > >
> > > Results:
> > >
> > >   Cluster  Compression  Compressed size  Prog    Decompression speed
> > >
> > >   4k       zlib         3228811264       qemu    5.921 s ± 0.074 s
> > >   4k       zstd         3258097664       qemu    5.189 s ± 0.158 s
> > >
> > >   4k       zlib/zstd                     nbdkit  failed, bug!!
> > >
> > >   64k      zlib         3164667904       qemu    3.579 s ± 0.094 s
> > >   64k      zstd         3132686336       qemu    1.770 s ± 0.060 s
> > >
> > >   64k      zlib         3164667904       nbdkit  1.254 s ± 0.065 s
> > >   64k      zstd         3132686336       nbdkit  1.315 s ± 0.037 s
> > >
> > >   512k     zlib         3158744576       qemu    4.008 s ± 0.058 s
> > >   512k     zstd         3032697344       qemu    1.503 s ± 0.072 s
> > >
> > >   512k     zlib         3158744576       nbdkit  1.702 s ± 0.026 s
> > >   512k     zstd         3032697344       nbdkit  1.593 s ± 0.039 s
> > >
> > >   2048k    zlib         3197569024       qemu    4.327 s ± 0.051 s
> > >   2048k    zstd         2995143168       qemu    1.465 s ± 0.085 s
> > >
> > >   2048k    zlib         3197569024       nbdkit  3.660 s ± 0.011 s
> > >   2048k    zstd         2995143168       nbdkit  3.368 s ± 0.057 s
> > >
> > > No great surprises - very small cluster size is inefficient, but
> > > scaling up the cluster size gains performance, and zstd performs better
> > > than zlib once the cluster size is sufficiently large.
> >
> > The default qcow2 cluster size is 64k, which means we've already
> > got the vast majority of the performance and file size win. Going
> > beyond 64k defaults doesn't seem massively compelling.
> >
> > zstd does have a small space win over zlib as expected, but again
> > nothing so drastic that it seems compelling to change - that win
> > will be line noise in the overall bigger picture of image storage
> > and download times.
>
> Yeah, I was a bit surprised by this. I expected zstd files to be
> significantly smaller than zlib even though that's not what zstd is
> optimized for. Not that they'd be about the same.
>
> > The major difference here is that zstd is much faster than zlib
> > at decompression. I'd be curious if zlib-ng closes that gap?
>
> It's quite hard to use zlib-ng in Fedora (currently) since it requires
> changes to the source code. That is what the pull request being
> discussed would change, as you could simply install zlib-ng-compat
> which would replace libz.so.
> But anyway I can't easily get results
> for qemu + zlib-ng, but we expect it would be ~ 40% faster at
> decompression, and decompression is what is taking most of the time in
> the qemu numbers above.
>
> I forgot to say that nbdkit is using zlib-ng, since I made the source
> level changes a few weeks back (but most of the nbdkit performance
> improvement comes from being able to use lots of threads).

Ah, that last point is interesting. If we look at the nbdkit results we
can see that while zstd is clearly faster, the margin of the win is
massively lower. So I presume we can infer similar margins if qemu-img
were switched too.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
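
As a rough check on the margins discussed in this thread, the following Python sketch computes the zlib/zstd time ratio for each row pair of the results table. The numbers are the hyperfine means quoted above; the ± spreads are ignored. The dictionary layout and the function name zlib_over_zstd are illustrative assumptions, not part of qemu, nbdkit or hyperfine.

```python
# Mean wall-clock times in seconds, copied from the results table in the thread.
# Keys are (cluster size, program); values are (zlib time, zstd time).
# The 4k nbdkit row is omitted because that run failed.
TIMES = {
    ("4k", "qemu"): (5.921, 5.189),
    ("64k", "qemu"): (3.579, 1.770),
    ("64k", "nbdkit"): (1.254, 1.315),
    ("512k", "qemu"): (4.008, 1.503),
    ("512k", "nbdkit"): (1.702, 1.593),
    ("2048k", "qemu"): (4.327, 1.465),
    ("2048k", "nbdkit"): (3.660, 3.368),
}


def zlib_over_zstd(times):
    """Return {(cluster, prog): zlib_time / zstd_time}.

    A ratio above 1 means the zstd image decompressed faster than the
    zlib image for that cluster size and program.
    """
    return {key: zlib / zstd for key, (zlib, zstd) in times.items()}


if __name__ == "__main__":
    for (cluster, prog), ratio in zlib_over_zstd(TIMES).items():
        print(f"{cluster:>6} {prog:<7} zlib/zstd = {ratio:.2f}")
```

With these figures the qemu ratio is roughly 2 or more from 64k upwards, while the nbdkit ratios (which use zlib-ng, per the message above) all stay close to 1. At 64k the nbdkit zlib run is marginally faster than the zstd run, although the difference is within the reported spread.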