On Tue, May 3, 2016 at 6:34 AM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> On 05/02/2016 02:00 PM, Howard Chu wrote:
>>
>> Sage Weil wrote:
>>>
>>> 1) Thoughts on moving to rocksdb in general?
>>
>>
>> Are you actually prepared to undertake all of the measurement and tuning
>> required to make RocksDB actually work well? You're switching from an
>> (abandoned/unsupported) engine with only a handful of config parameters
>> to one with ~40-50 params, all of which have critical but unpredictable
>> impact on resource consumption and performance.
>>
>
> You are absolutely correct, and there are definitely pitfalls we need to
> watch out for with the number of tunables in rocksdb. At least on the
> performance side two of the big issues we've hit with leveldb are compaction
> related. In some scenarios compaction happens slower than the number of
> writes coming in resulting in ever-growing db sizes. The other issue is
> that compaction is single threaded and this can cause stalls and general
> mayhem when things get really heavily loaded. My hope is that if we do go
> with rocksdb, even in a sub-optimally tuned state, we'll be better off than
> we were with leveldb.
>
> We did some very preliminary benchmarks a couple of years ago (admittedly a
> too-small dataset size) basically comparing the (at the time) stock ceph
> leveldb settings vs rocksdb. On this set size, leveldb looked much better
> for reads, but much worse for writes.

That's actually a bit troubling: many of our monitor problems have
arisen from slow reads, rather than slow writes. I suspect we want to
eliminate this before switching, if it's a concern.

...Although I think I did see a monitor caching layer go by, so maybe
it's a moot point now?
-Greg