On Thu, 26 Oct 2017, Stefan Priebe - Profihost AG wrote:
> Hi Sage,
>
> On 25.10.2017 at 21:54, Sage Weil wrote:
> > On Wed, 25 Oct 2017, Stefan Priebe - Profihost AG wrote:
> >> Hello,
> >>
> >> in the Luminous release notes it is stated that zstd is not supported by
> >> BlueStore for performance reasons. I'm wondering why btrfs, by contrast,
> >> states that zstd is as fast as lz4 but compresses as well as zlib.
> >>
> >> Why is zlib then supported by BlueStore? And why do btrfs / Facebook
> >> behave differently?
> >>
> >> "BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
> >> also supports zstd for RGW compression but zstd is not recommended for
> >> BlueStore for performance reasons.)"
> >
> > zstd will work, but in our testing the performance wasn't great for
> > bluestore in particular. The problem was that for each compression run
> > there is a relatively high start-up cost initializing the zstd
> > context/state (IIRC a memset of a huge memory buffer) that dominated the
> > execution time... primarily because bluestore is generally compressing
> > pretty small chunks of data at a time, not big buffers or streams.
> >
> > Take a look at the unittest_compression timings on compressing 16KB
> > buffers (smaller than bluestore usually needs, but illustrative of the
> > problem):
> >
> > [ RUN      ] Compressor/CompressorTest.compress_16384/0
> > [plugin zlib (zlib/isal)]
> > [       OK ] Compressor/CompressorTest.compress_16384/0 (294 ms)
> > [ RUN      ] Compressor/CompressorTest.compress_16384/1
> > [plugin zlib (zlib/noisal)]
> > [       OK ] Compressor/CompressorTest.compress_16384/1 (1755 ms)
> > [ RUN      ] Compressor/CompressorTest.compress_16384/2
> > [plugin snappy (snappy)]
> > [       OK ] Compressor/CompressorTest.compress_16384/2 (169 ms)
> > [ RUN      ] Compressor/CompressorTest.compress_16384/3
> > [plugin zstd (zstd)]
> > [       OK ] Compressor/CompressorTest.compress_16384/3 (4528 ms)
> >
> > It's an order of magnitude slower than zlib or snappy, which probably
> > isn't acceptable--even if the result is a bit smaller.
> >
> > We just updated to a newer zstd the other day, but I haven't been paying
> > attention to the zstd code changes. When I was working on this, the
> > plugin was initially also misusing the zstd API, but it was also pointed
> > out that the size of the memset depends on the compression level.
> > Maybe a different (default) choice there would help.
> >
> > https://github.com/facebook/zstd/issues/408#issuecomment-252163241
>
> Thanks for the fast reply. Btrfs uses a default compression level of 3,
> but I think that is the default anyway.
>
> Does the zstd plugin of Ceph already use the mentioned ZSTD_resetCStream
> instead of creating and initializing a new stream every time?

Hmm, it doesn't:

	https://github.com/ceph/ceph/blob/master/src/compressor/zstd/ZstdCompressor.h#L29

but perhaps that was because it didn't make a difference?  Might be worth
revisiting.

> So if performance matters, Ceph would recommend snappy?

Yep!

sage
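
For reference, below is a minimal sketch of the context-reuse pattern
discussed above: initialize the zstd streaming context once (paying the
expensive workspace setup a single time) and reuse it via ZSTD_resetCStream
for each small chunk. This is illustrative only -- the class name and
structure are invented for the example and this is not the actual Ceph
ZstdCompressor plugin code; error handling is kept minimal.

  // Illustrative sketch only -- not the Ceph ZstdCompressor implementation.
  // Compile against libzstd (-lzstd).
  #include <zstd.h>
  #include <cstddef>

  class ReusedZstdStream {
    ZSTD_CStream* zcs;
   public:
    explicit ReusedZstdStream(int level) : zcs(ZSTD_createCStream()) {
      // The expensive part (setting up the large internal workspace)
      // happens once here, not once per compressed chunk.
      ZSTD_initCStream(zcs, level);
    }
    ~ReusedZstdStream() { ZSTD_freeCStream(zcs); }

    // Compress one small chunk; returns the compressed size, or 0 on
    // error / insufficient output space.
    size_t compress(const void* src, size_t src_len,
                    void* dst, size_t dst_cap) {
      // Cheap per-chunk reset that keeps the already-initialized parameters.
      ZSTD_resetCStream(zcs, src_len);
      ZSTD_inBuffer in{src, src_len, 0};
      ZSTD_outBuffer out{dst, dst_cap, 0};
      if (ZSTD_isError(ZSTD_compressStream(zcs, &out, &in)))
        return 0;
      size_t remaining = ZSTD_endStream(zcs, &out);
      if (ZSTD_isError(remaining) || remaining != 0)
        return 0;
      return out.pos;
    }
  };

Whether this actually narrows the gap to snappy on 16KB chunks would need
re-measuring with unittest_compression, and the size of the per-context
initialization still depends on the chosen compression level.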