RE: mon switch from leveldb to rocksdb

Hi Sage, 

What about filestore_omap_backend? It currently defaults to leveldb. Would it also be switched to rocksdb?

thanks, -yuan

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
Sent: Tuesday, May 3, 2016 5:47 AM
To: skinjo@xxxxxxxxxx
Cc: Wido den Hollander <wido@xxxxxxxx>; Ceph Development <ceph-devel@xxxxxxxxxxxxxxx>
Subject: Re: mon switch from leveldb to rocksdb

On Tue, 3 May 2016, Shinobu Kinjo wrote:
> If possible, it would be much better to make it pluggable so that we 
> select what we want.

Yeah, that is the plan.  The mon_keyvaluedb option will select leveldb or rocksdb.  We'd just switch the default over at some point, once we're satisfied with stability.
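
As a rough illustration only: assuming the option is read from the [mon] section of ceph.conf like other mon settings, and that it is consulted when a mon's store is first created, opting a new mon into rocksdb might look like this:

```ini
# Sketch, not a tested configuration. Assumes mon_keyvaluedb is set
# before the mon is created; an existing mon keeps its current store.
[mon]
mon keyvaluedb = rocksdb
```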

After thinking about this some more I agree with Wido that the conversion isn't useful enough to bother with.  We can just make new mons use rocksdb, and if someone wants to convert, they can add/remove/replace mons in their cluster to get there.

sage



> 
> On Tue, May 3, 2016 at 6:25 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> >
> >> On 2 May 2016 at 20:49, Sage Weil <sweil@xxxxxxxxxx> wrote:
> >>
> >>
> >> We're thinking about switching the default backend on the mon from 
> >> leveldb to rocksdb.  Rocksdb is better maintained, has a stronger 
> >> feature set, is generally faster, and is linked statically, which 
> >> means we won't be vulnerable to buggy distro packages.
> >>
> >> There is one blocker, though.  Some distro leveldbs name the sst 
> >> files with the .ldb suffix.  (Some don't; very annoying.)  There is 
> >> a unit test in rocksdb that tries to verify that ldb is silently 
> >> renamed to sst, and it passes, but the test is incomplete: the test 
> >> fails to verify that ldb/sst files can actually be read, and it 
> >> turns out only the 'check' path (not the normal open-and-read 
> >> path) handles ldb properly.
> >>
> >> Anyway, once that works, rocksdb will magically upgrade from 
> >> leveldb to rocksdb.  Note that once that happens you can't switch 
> >> from rocksdb back to leveldb without recreating the mon.
> >>
> >> Alternatively, we could not worry about upgrading existing leveldb 
> >> instances and just make newly created mons default to rocksdb.
> >>
> >> 1) Thoughts on moving to rocksdb in general?
> >>
> >> 2) Importance of leveldb->rocksdb conversion?
> >>
> >
> > I would not touch this auto conversion at first. I know there are things to gain, but is the gain large enough to be worth the risk of corrupting monitors?
> >
> > Is it that LevelDB doesn't handle the load of large clusters, for example? IMHO the majority of Ceph clusters are still far below 500 OSDs.
> >
> > Personally, I always try to stay away from touching the MONs' datastore. It always feels a bit scary.
> >
> > Wido
> >
> >> 3) Anyone want to fix the ldb handling in rocksdb?
> >>
> >> Thanks!
> >> sage
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe 
> >> ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx 
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 
> --
> Email:
> shinobu@xxxxxxxxx
> GitHub:
> shinobu-x
> Blog:
> Life with Distributed Computational System based on OpenSource