Re: Bluestore cluster example

On Fri, 22 Apr 2016, Dan van der Ster wrote:
> On Fri, Apr 22, 2016 at 4:09 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> > On Fri, 22 Apr 2016, Dan van der Ster wrote:
> >> Hi Mark,
> >>
> >> On Fri, Apr 15, 2016 at 2:06 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> >> > Hi all,
> >> >
> >> > A couple of folks have asked me how to set up bluestore clusters for
> >> > performance testing.  I personally am using cbt for this, but you should be
> >> > able to use ceph-disk with some other cluster creation method as well.
> >> >
> >> > For CBT, you really don't need to do much.  In the old newstore days, a
> >> > "block" symlink needed to be created in the osd data dir to link to the new
> >> > block device.  CBT did this when the "newstore_block: True" option was set
> >> > in the cluster section of the cbt yaml file.  This isn't really needed
> >> > anymore, as you can now specify the block, db, and wal devices directly in
> >> > your ceph.conf file.  If your partitions are set up properly you can create
> >> > bluestore clusters without having to do anything beyond changing the
> >> > ceph.conf file (with cbt at least).
> >> >
> >> > Here's a very basic example:
> >> >
> >> > [global]
> >> >         enable experimental unrecoverable data corrupting features = bluestore rocksdb
> >> >         osd objectstore = bluestore
> >> >
> >> > [osd.0]
> >> >         host = incerta01.front.sepia.ceph.com
> >> >         osd data = /tmp/cbt/mnt/osd-device-0-data
> >> >         bluestore block path = /dev/disk/by-partlabel/osd-device-0-block
> >> >         bluestore block db path = /dev/disk/by-partlabel/osd-device-0-db
> >> >         bluestore block wal path = /dev/disk/by-partlabel/osd-device-0-wal
> >>
> >> The db and wal paths are optional perf boosting things, right?
> >
> > Yeah.
> >
> >> If I do ceph-disk prepare --bluestore /dev/sdae, I get:
> >>
> >> /dev/sdae :
> >>  /dev/sdae2 ceph block, for /dev/sdae1
> >>  /dev/sdae1 ceph data, active, cluster ceph, osd.413, block /dev/sdae2
> >
> > Looks right!  ceph-disk isn't smart enough to set up the wal or db devices
> > yet.
> >
> > The wal device should default to something like 128 MB.
> >
> > The db device would be whatever portion of an SSD you want to allocate to
> > storing the bluestore metadata.  My guess is that we'll get further than
> > we did with FileStore, so the old 1:4 or 1:5 rule of thumb might be more
> > like 1:10, but who knows--this'll require some testing.
> 
> Cool, I'll start testing wal/db on SSDs with some new machines ~next week.
> 
> Just to clarify, if we give no wal, there is no wal, right?
> But if we don't give a dedicated db device, where does the db go?

There are always wal and db "files" from rocksdb, but if there aren't 
separate devices they just end up on the main block device.
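
To make that fallback concrete, here is a minimal Python sketch, not anything cbt or ceph-disk actually ships. The function names (render_osd_section, effective_placement) are hypothetical; the option names are the ones from Mark's example. The db and wal lines are simply left out when no separate device is given. One assumption goes beyond this thread: when a db device is given without a wal device, the sketch places the wal on the db device, which matches my understanding of BlueFS but is not stated above.

```python
from typing import Dict, Optional


def render_osd_section(osd_id: int, host: str, data_dir: str, block: str,
                       db: Optional[str] = None,
                       wal: Optional[str] = None) -> str:
    """Render a ceph.conf [osd.N] section (hypothetical helper).

    The db and wal path options are emitted only when a separate
    device is supplied; both are optional.
    """
    lines = [
        f"[osd.{osd_id}]",
        f"        host = {host}",
        f"        osd data = {data_dir}",
        f"        bluestore block path = {block}",
    ]
    if db is not None:
        lines.append(f"        bluestore block db path = {db}")
    if wal is not None:
        lines.append(f"        bluestore block wal path = {wal}")
    return "\n".join(lines) + "\n"


def effective_placement(block: str, db: Optional[str] = None,
                        wal: Optional[str] = None) -> Dict[str, str]:
    """Report which device the rocksdb db and wal files land on.

    With no separate devices, both end up on the main block device.
    Assumption (not from this thread): a missing wal device falls back
    to the db device first, then to the main block device.
    """
    db_dev = db if db is not None else block
    wal_dev = wal if wal is not None else db_dev
    return {"block": block, "db": db_dev, "wal": wal_dev}


if __name__ == "__main__":
    # Block device only, as produced by a plain bluestore prepare.
    print(render_osd_section(0, "incerta01.front.sepia.ceph.com",
                             "/tmp/cbt/mnt/osd-device-0-data",
                             "/dev/disk/by-partlabel/osd-device-0-block"))
    print(effective_placement("/dev/disk/by-partlabel/osd-device-0-block"))
```

Keeping the rendering separate from the placement logic means the same section generator works whether or not an SSD partition is available for db/wal.
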

> Also, bluestore being experimental in jewel -- I guess that means you
> don't promise to support the current on disk format in kraken and
> beyond?

Correct.

It's *possible* we will include upgrade/conversion code, but it'll sort of
depend on how much effort it is when a format change comes along.

The goal is that by kraken it will be stable.

sage