Hey Igor,

On 27/06/2023 12:06, Igor Fedotov wrote:
> I can't say anything about your primary question on zstd benefits/drawbacks, but I'd like to emphasize that the compression ratio in BlueStore is (to a major degree) determined by the characteristics of the input data flow (primarily the write block size), the object store allocation unit size (bluestore_min_alloc_size), and parameters (e.g. the maximum blob size) that determine how input data chunks are logically split when landing on disk.
>
> E.g. if min_alloc_size is set to 4K and the write block size is in (4K, 8K], the resulting compressed block can never be smaller than 4K, hence the compression ratio is never more than 2. Similarly, if min_alloc_size is 64K there is no benefit from compression at all for such input, since the target allocation units are always larger than the input blocks.
>
> The rationale for this behavior is that compression is applied exclusively to input blocks; there is no additional processing to merge input and existing data and compress them together.
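Just to make sure I follow the arithmetic, here is a toy Python sketch (my own illustration of the rounding you describe, not BlueStore code) showing how rounding the compressed result up to whole allocation units caps the effective ratio:

  import math

  def effective_ratio(write_size, compressed_size, min_alloc_size=4096):
      # On-disk size is rounded up to whole allocation units,
      # mirroring (in spirit) what you describe above.
      on_disk = math.ceil(compressed_size / min_alloc_size) * min_alloc_size
      return write_size / on_disk

  # An 8K write squeezed to 1K still occupies one 4K unit -> ratio 2.0 at best
  print(effective_ratio(8192, 1024, min_alloc_size=4096))   # 2.0

  # With min_alloc_size=64K an 8K write occupies one 64K unit
  # whether compressed or not -> compression brings no benefit
  print(effective_ratio(8192, 1024, min_alloc_size=65536))  # 0.125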
Thanks for the emphasis on the input data and its block size. Yes, that is certainly the most important factor for compression efficiency and for choosing a suitable algorithm for a given use case. In my case the pool is RBD only, so (by default) the objects are 4M if I am not mistaken. I also understand that even though larger blocks generally compress better, there is no relation between different blocks in terms of compression dictionaries (along the lines of de-duplication). In the end, in my use case it boils down to the type of data stored on the RBD images and how compressible that is. And since those blocks are only written once, I am ready to invest more CPU cycles to reduce the size on disk.
I am simply looking for data others might have collected on similar use cases. I am also still wondering whether there really is nobody who has worked/played more with zstd, given how popular it has become in recent months...
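For what it's worth, I have started poking at the CPU/ratio trade-off myself with a quick Python sketch (this uses the python-zstandard package on a sample chunk pulled from one of my images; the file name and levels are just my own choices, nothing Ceph-specific):

  import time
  import zstandard as zstd

  # "sample_4m.bin" is a placeholder for a representative 4M chunk of my data
  with open("sample_4m.bin", "rb") as f:
      data = f.read(4 * 1024 * 1024)

  for level in (1, 3, 6, 10, 19):
      cctx = zstd.ZstdCompressor(level=level)
      t0 = time.perf_counter()
      out = cctx.compress(data)
      dt = time.perf_counter() - t0
      print(f"level {level:2d}: ratio {len(data) / len(out):5.2f}, {dt * 1000:7.1f} ms")

But numbers from a single chunk on my workstation obviously say little about OSD behavior under real load, hence my question.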
Regards,
Christian