Re: Bluestore compression - Which algo to choose? Zstd really still that bad?

Hi Christian,

I can't say anything about your primary question on zstd benefits/drawbacks, but I'd like to emphasize that the compression ratio in BlueStore is (to a major degree) determined by the characteristics of the input data flow (primarily the write block size), the object store allocation unit size (bluestore_min_alloc_size), and the parameters (e.g. maximum blob size) that determine how input data chunks are logically split when they land on disk.

E.g. if min_alloc_size is set to 4K and the write block size is in (4K, 8K], then the resulting compressed block will never be smaller than 4K. Hence the compression ratio is never more than 2.

Similarly, if min_alloc_size is 64K there is no benefit from compression at all for the above input, since the target allocation units are always larger than the input blocks.

The reason for this behavior is that compression is applied to input blocks only - there is no additional processing that merges incoming and existing data and compresses them all together.
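To make the arithmetic concrete, here is a minimal Python sketch (not BlueStore code - it only assumes, per the above, that a compressed blob still occupies whole allocation units on disk):

def stored_size(compressed_bytes, min_alloc_size):
    # compressed data is still stored in whole allocation units
    units = -(-compressed_bytes // min_alloc_size)   # ceiling division
    return units * min_alloc_size

def effective_ratio(write_size, compressed_bytes, min_alloc_size):
    # > 1 means compression actually saves space for this write
    return write_size / stored_size(compressed_bytes, min_alloc_size)

# an 8K write that compresses down to 1K:
print(effective_ratio(8192, 1024, 4096))    # 2.0   -> capped at 2 with 4K alloc units
print(effective_ratio(8192, 1024, 65536))   # 0.125 -> no gain with 64K alloc units,
                                            #          so the data stays uncompressed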


Thanks,

Igor


On 26/06/2023 11:48, Christian Rohmann wrote:
Hey ceph-users,

we've been using the default "snappy" to have Ceph compress data on certain pools - namely backups / copies of volumes of a VM environment.
So it's write once, and no random access.
I am now wondering if switching to another algorithm (the options being snappy, zlib, lz4, or zstd) would improve the compression ratio (significantly)?

* Does anybody have any real world data on snappy vs. $anyother?

Using zstd is tempting as it's used in various other applications (btrfs, MongoDB, ...) for inline compression with great success. For Ceph, though, the docs still carry a warning ([1]) that it is not recommended. I am wondering whether this still stands with e.g. [2] merged. There was also [3], which tried to improve the performance, but that reads as if it only led to a dead end with no code changes?


In any case does anybody have any numbers to help with the decision on the compression algo?
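(For a first ballpark on your own data, one option is to compress a representative sample offline with the candidate codecs' Python bindings - a rough sketch, assuming the third-party packages python-snappy, lz4 and zstandard are installed and using a placeholder file name; the on-disk savings in BlueStore will be lower than these raw ratios because compressed blobs are still rounded up to whole allocation units:)

import zlib
import snappy          # python-snappy
import lz4.frame
import zstandard

CHUNK = 64 * 1024      # compress in 64K pieces, roughly how BlueStore chunks blobs

with open("sample-backup.img", "rb") as f:   # placeholder: any representative backup data
    data = f.read(256 * 1024 * 1024)         # a few hundred MiB gives a reasonable ballpark

codecs = {
    "snappy": snappy.compress,
    "zlib":   zlib.compress,
    "lz4":    lz4.frame.compress,
    "zstd":   zstandard.ZstdCompressor().compress,
}

for name, compress in codecs.items():
    raw = packed = 0
    for off in range(0, len(data), CHUNK):
        block = data[off:off + CHUNK]
        raw += len(block)
        packed += len(compress(block))
    print(f"{name:7s} ratio ~ {raw / packed:.2f}")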



Regards


Christian


[1] https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/#confval-bluestore_compression_algorithm
[2] https://github.com/ceph/ceph/pull/33790
[3] https://github.com/facebook/zstd/issues/910
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx