Hi Sage,

How about filestore_omap_backend? It is currently set to leveldb by
default. Would it be switched to rocksdb as well?

thanks,
-yuan

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
Sent: Tuesday, May 3, 2016 5:47 AM
To: skinjo@xxxxxxxxxx
Cc: Wido den Hollander <wido@xxxxxxxx>; Ceph Development <ceph-devel@xxxxxxxxxxxxxxx>
Subject: Re: mon switch from leveldb to rocksdb

On Tue, 3 May 2016, Shinobu Kinjo wrote:
> If possible, it would be much better to make it pluggable so that we
> select what we want.

Yeah, that is the plan. The mon_keyvaluedb will select leveldb or
rocksdb. We'd just switch the default over at some point, once we're
satisfied with stability.

After thinking about this some more I agree with Wido that the
conversion isn't useful enough to bother with. We can just make new
mons use rocksdb, and if someone wants to convert, they can
add/remove/replace mons in their cluster to get there.

sage

>
> On Tue, May 3, 2016 at 6:25 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> >
> >> On 2 May 2016 at 20:49, Sage Weil <sweil@xxxxxxxxxx> wrote:
> >>
> >>
> >> We're thinking about switching the default backend on the mon from
> >> leveldb to rocksdb. Rocksdb is better maintained, has a stronger
> >> feature set, is generally faster, and is linked statically, which
> >> means we won't be vulnerable to buggy distro packages.
> >>
> >> There is one blocker, though. Some distro leveldbs name the sst
> >> files with the .ldb suffix. (Some don't; very annoying.) There is
> >> a unit test in rocksdb that tries to verify that ldb is silently
> >> renamed to sst, and it passes, but the test is incomplete: it
> >> fails to verify that ldb/sst files can actually be read, and it
> >> turns out only the 'check' path (not the normal open-and-read
> >> path) handles ldb properly.
> >>
> >> Anyway, once that works, rocksdb will magically upgrade from
> >> leveldb to rocksdb. Note that once that happens you can't switch
> >> from rocksdb back to leveldb without recreating the mon.
> >>
> >> Alternatively, we could not worry about upgrading existing leveldb
> >> instances and just make newly created mons default to rocksdb.
> >>
> >> 1) Thoughts on moving to rocksdb in general?
> >>
> >> 2) Importance of leveldb->rocksdb conversion?
> >>
> >
> > I would not touch this auto conversion at first. I know there are
> > things to gain, but is the gain large enough to be worth
> > potentially corrupting monitors?
> >
> > Is it that LevelDB doesn't handle large cluster load, for example?
> > Imho the majority of Ceph clusters are still far below 500 OSDs.
> >
> > Personally I always try to stay away from touching the MONs'
> > datastore. It always feels a bit scary.
> >
> > Wido
> >
> >> 3) Anyone want to fix the ldb handling in rocksdb?
> >>
> >> Thanks!
> >> sage
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe
> >> ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
> --
> Email:
> shinobu@xxxxxxxxx
> GitHub:
> shinobu-x
> Blog:
> Life with Distributed Computational System based on OpenSource