RE: mon switch from leveldb to rocksdb

On Tue, 3 May 2016, Zhou, Yuan wrote:
> Hi Sage, 
> 
> How about filestore_omap_backend? It's set to leveldb by default 
> now. Would it be switched to rocksdb as well?

I'd rather leave FileStore alone since it will eventually be deprecated.  
It's also more sensitive to performance variation and we'd need to be a 
lot more careful making any changes.

sage



> 
> thanks, -yuan
> 
> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
> Sent: Tuesday, May 3, 2016 5:47 AM
> To: skinjo@xxxxxxxxxx
> Cc: Wido den Hollander <wido@xxxxxxxx>; Ceph Development <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: Re: mon switch from leveldb to rocksdb
> 
> On Tue, 3 May 2016, Shinobu Kinjo wrote:
> > If possible, it would be much better to make it pluggable so that we 
> > select what we want.
> 
> Yeah, that is the plan.  The mon_keyvaluedb will select leveldb or rocksdb.  We'd just switch the default over at some point, once we're satisfied with stability.
> 
> After thinking about this some more I agree with Wido that the conversion isn't useful enough to bother with.  We can just make new mons use rocksdb, and if someone wants to convert, they can add/remove/replace mons in their cluster to get there.
> 
> sage
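
As an illustration only, here is a minimal Python sketch of the pluggable selection described above: a config key picks the backend by name, and a separate default applies only to stores created without an explicit choice. Apart from the option name mon_keyvaluedb and the backend names leveldb and rocksdb, every name here (MemoryKV, BACKENDS, DEFAULT_BACKEND, open_store) is hypothetical and not taken from the Ceph source; the backends are in-memory stand-ins rather than real bindings.

```python
from typing import Callable, Dict


class MemoryKV:
    """In-memory stand-in for a real key-value store binding (hypothetical)."""

    def __init__(self, backend_name: str) -> None:
        self.backend_name = backend_name
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]


# Registry of available backends, keyed by the name used in the config.
BACKENDS: Dict[str, Callable[[], MemoryKV]] = {
    "leveldb": lambda: MemoryKV("leveldb"),
    "rocksdb": lambda: MemoryKV("rocksdb"),
}

# Switching the default only affects stores created without an explicit choice.
DEFAULT_BACKEND = "leveldb"


def open_store(config: Dict[str, str]) -> MemoryKV:
    """Create a store using config['mon_keyvaluedb'], or the default."""
    name = config.get("mon_keyvaluedb", DEFAULT_BACKEND)
    try:
        factory = BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown key-value backend: {name!r}") from None
    return factory()
```

With this shape, changing DEFAULT_BACKEND to "rocksdb" would leave any store that names its backend explicitly untouched, which mirrors the idea of switching only the default for newly created mons.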
> 
> 
> 
> > 
> > On Tue, May 3, 2016 at 6:25 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> > >
> > >> On 2 May 2016 at 20:49, Sage Weil <sweil@xxxxxxxxxx> wrote:
> > >>
> > >>
> > >> We're thinking about switching the default backend on the mon from 
> > >> leveldb to rocksdb.  Rocksdb is better maintained, has a stronger 
> > >> feature set, is generally faster, and is linked statically, which 
> > >> means we won't be vulnerable to buggy distro packages.
> > >>
> > >> There is one blocker, though.  Some distro leveldbs name the sst 
> > >> files with the .ldb suffix.  (Some don't; very annoying.)  There is 
> > >> a unit test in rocksdb that tries to verify that ldb is silently 
> > > >> renamed to sst, and it passes, but the test is incomplete: it 
> > > >> fails to verify that ldb/sst files can actually be read, and it 
> > > >> turns out only the 'check' path (not the normal open-and-read 
> > > >> path) handles ldb properly.
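
As an illustration only, the following Python sketch lists which table files in a store directory use the .ldb suffix and which use .sst, which is one way to see whether a given store would hit the suffix problem described above. The helper names (find_table_files, demo) and the file names in the demo are hypothetical. The sketch only inspects file names; it does not open or validate the files and does not reproduce any rocksdb logic.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Suffixes used for table files, per the description above.
TABLE_SUFFIXES = (".ldb", ".sst")


def find_table_files(store_dir: str) -> Dict[str, List[str]]:
    """Group the table files found directly in store_dir by suffix."""
    found: Dict[str, List[str]] = {suffix: [] for suffix in TABLE_SUFFIXES}
    for entry in sorted(Path(store_dir).iterdir()):
        if entry.is_file() and entry.suffix in found:
            found[entry.suffix].append(entry.name)
    return found


def demo() -> Dict[str, List[str]]:
    """Build a throwaway directory with made-up file names and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("000005.ldb", "000007.sst", "CURRENT", "MANIFEST-000004"):
            (Path(tmp) / name).touch()
        return find_table_files(tmp)
```

A store whose ".ldb" list is non-empty was written by a leveldb build that uses the .ldb suffix, so it is the case that depends on the read path handling ldb files correctly.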
> > >>
> > > >> Anyway, once that works, rocksdb will transparently upgrade an 
> > > >> existing leveldb store to rocksdb when it opens it.  Note that once 
> > > >> that happens you can't switch from rocksdb back to leveldb without 
> > > >> recreating the mon.
> > >>
> > >> Alternatively, we could not worry about upgrading existing leveldb 
> > >> instances and just make newly created mons default to rocksdb.
> > >>
> > >> 1) Thoughts on moving to rocksdb in general?
> > >>
> > >> 2) Importance of leveldb->rocksdb conversion?
> > >>
> > >
> > > > I would not touch this auto conversion at first. I know there are things to gain, but is the gain large enough to be worth the risk of corrupting monitors?
> > >
> > > > Is it, for example, that LevelDB doesn't handle the load of large clusters? IMHO the majority of Ceph clusters are still far below 500 OSDs.
> > >
> > > > Personally, I always try to stay away from touching the MONs' datastore. It always feels a bit scary.
> > >
> > > Wido
> > >
> > >> 3) Anyone want to fix the ldb handling in rocksdb?
> > >>
> > >> Thanks!
> > >> sage
> > >>
> > >> --
> > >> To unsubscribe from this list: send the line "unsubscribe 
> > >> ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx 
> > >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > 
> > 
> > --
> > Email:
> > shinobu@xxxxxxxxx
> > GitHub:
> > shinobu-x
> > Blog:
> > Life with Distributed Computational System based on OpenSource
> > 
> > 
> 
> 


