I don't have the most experience with Ceph, as my use case is a homelab and I am only a few months in. I enabled compression on my VM (Proxmox hosts) disk RBD pool using mode = aggressive, algorithm = lz4, with all other compression settings left at their defaults. After copying all of the VM disks to another storage and then back to the Ceph pool, I saw a nearly 50% reduction in the space needed for the VM disks. I have not had a chance to benchmark the VM disks with compression yet, as I am waiting for the cluster to calm down from some other disk moves.
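For reference, enabling this on an existing pool is just two pool settings ("vm-disks" below is a placeholder for the actual pool name; note that only data written after the change gets compressed, which is why I copied the disks off and back):

    ceph osd pool set vm-disks compression_mode aggressive
    ceph osd pool set vm-disks compression_algorithm lz4

    # the per-pool USED COMPR / UNDER COMPR columns show the effect
    ceph df detail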
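Igor's point below about bluestore_min_alloc_size capping the ratio can be made concrete with a quick worked example (my own arithmetic with made-up numbers; a compressed blob is rounded up to whole allocation units):

    8K write compressing to 3K, min_alloc_size = 4K:
        allocated = roundup(3K to 4K) = 4K  ->  ratio = 8K / 4K = 2.0 at best
    8K write, min_alloc_size = 64K:
        allocated = one 64K unit either way ->  no benefit from compression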
On Tue, Jun 27, 2023 at 7:01 AM Christian Rohmann <christian.rohmann@xxxxxxxxx> wrote:

> Hey Igor,
>
> On 27/06/2023 12:06, Igor Fedotov wrote:
> > I can't say anything about your primary question on zstd
> > benefits/drawbacks, but I'd like to emphasize that the compression
> > ratio at BlueStore is (to a major degree) determined by the input data
> > flow characteristics (primarily write block size), the object store
> > allocation unit size (bluestore_min_alloc_size), and some parameters
> > (e.g. maximum blob size) that determine how input data chunks are
> > logically split when landing on disk.
> > E.g. if min_alloc_size is set to 4K and the write block size is in
> > (4K-8K], the resulting compressed block will never be less than 4K.
> > Hence the compression ratio is never more than 2.
> > Similarly, if min_alloc_size is 64K there is no benefit in compression
> > at all for the above input, since target allocation units are always
> > larger than the input blocks.
> > The rationale for the above behavior is that compression is applied
> > exclusively to input blocks - there is no additional processing to
> > merge input and existing data and compress them all together.
>
> Thanks for the emphasis on input data and its block size. Yes, that is
> certainly the most important factor for the compression efficiency and
> the choice of a suitable algorithm for a certain use case.
> In my case the pool is RBD only, so (by default) the objects are 4M if
> I am not mistaken. I also understand that even though larger blocks
> generally compress better, there is no relation between different
> blocks in regard to compression dictionaries (along the lines of
> de-duplication). In the end, in my use case it boils down to the type
> of data stored on the RBD images and how compressible that might be.
> But since those blocks are only written once, I am ready to invest more
> CPU cycles to reduce the size on disk.
>
> I am simply looking for data others might have collected on similar
> use cases.
> Also, I am still wondering whether there really is nobody who has
> worked/played more with zstd, since it has become so popular in recent
> months...
>
>
> Regards
>
>
> Christian
>

--
Zach Underwood (RHCE,RHCSA,RHCT,UACA)
My website <http://zachunderwood.me>
advance-networking.com