On Wed, 25 Oct 2017, Stefan Priebe - Profihost AG wrote:
> Hello,
>
> in the luminous release notes it is stated that zstd is not supported by
> bluestore due to performance reasons. I'm wondering why btrfs instead
> states that zstd is as fast as lz4 but compresses as well as zlib.
>
> Why is zlib then supported by bluestore? And why do btrfs / facebook
> behave differently?
>
> "BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
> also supports zstd for RGW compression but zstd is not recommended for
> BlueStore for performance reasons.)"

zstd will work, but in our testing the performance wasn't great for
bluestore in particular. The problem was that for each compression run
there is a relatively high start-up cost initializing the zstd
context/state (IIRC a memset of a huge memory buffer) that dominated the
execution time... primarily because bluestore is generally compressing
pretty small chunks of data at a time, not big buffers or streams.

Take a look at the unittest_compression timings for compressing 16KB
buffers (smaller than bluestore usually needs, but illustrative of the
problem):

[ RUN ] Compressor/CompressorTest.compress_16384/0
[plugin zlib (zlib/isal)]
[ OK ] Compressor/CompressorTest.compress_16384/0 (294 ms)
[ RUN ] Compressor/CompressorTest.compress_16384/1
[plugin zlib (zlib/noisal)]
[ OK ] Compressor/CompressorTest.compress_16384/1 (1755 ms)
[ RUN ] Compressor/CompressorTest.compress_16384/2
[plugin snappy (snappy)]
[ OK ] Compressor/CompressorTest.compress_16384/2 (169 ms)
[ RUN ] Compressor/CompressorTest.compress_16384/3
[plugin zstd (zstd)]
[ OK ] Compressor/CompressorTest.compress_16384/3 (4528 ms)

zstd is an order of magnitude slower than zlib (with isal) or snappy,
which probably isn't acceptable--even if the output is a bit smaller.

We just updated to a newer zstd the other day but I haven't been paying
attention to the zstd code changes.
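
To make the small-chunk argument concrete, here is a rough sketch (not
Ceph or zstd code). It computes the slowdown ratios from the timings
posted above, then uses a toy model in which per-call cost is simply
proportional to bytes touched. The 1 MiB setup size is a hypothetical
placeholder, not a measured zstd figure, and setup_fraction() and
slowdown_vs() are names made up for this illustration.

```python
# Timings in ms, copied from the unittest_compression output above.
timings_ms = {
    "zlib/isal": 294,
    "zlib/noisal": 1755,
    "snappy": 169,
    "zstd": 4528,
}


def slowdown_vs(baseline, subject="zstd"):
    """How many times slower `subject` ran than `baseline` in that test."""
    return timings_ms[subject] / timings_ms[baseline]


def setup_fraction(chunk_bytes, setup_bytes):
    """Share of per-call work spent on fixed setup.

    Toy model: cost is proportional to bytes touched, so one call costs
    setup_bytes (e.g. clearing context state) plus chunk_bytes.
    """
    return setup_bytes / (setup_bytes + chunk_bytes)


# ASSUMPTION: hypothetical size of the state cleared on every call.
HYPOTHETICAL_SETUP_BYTES = 1 << 20  # 1 MiB

if __name__ == "__main__":
    for name in ("zlib/isal", "zlib/noisal", "snappy"):
        print(f"zstd vs {name}: {slowdown_vs(name):.1f}x slower")

    for chunk in (16 * 1024, 64 * 1024, 4 * 1024 * 1024):
        frac = setup_fraction(chunk, HYPOTHETICAL_SETUP_BYTES)
        print(f"chunk {chunk:>8} bytes: {frac:.1%} of work is setup")
```

Under this model the fixed setup swamps the useful work for 16KB chunks
and only fades once the buffers are several times larger than the state
being initialized, which is why a streaming or large-buffer user would
not see the same penalty.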

When I was working on this, the plugin was initially also misusing the
zstd API, but it was also pointed out that the size of the memset depends
on the compression level. Maybe a different (default) choice there would
help.

https://github.com/facebook/zstd/issues/408#issuecomment-252163241

sage
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com