Earlier in newstore/bluestore development we tested with the
rocksdb instance (and just the rocksdb WAL) on SSDs. At the time it did
help, but bluestore performance has improved dramatically since then so
we'll need to retest. SSDs shouldn't really help with large writes
anymore (bluestore is already avoiding the journal/wal write penalty for
large writes!). I suspect putting rocksdb on the SSD might still help
with small reads and writes to an extent, especially with large numbers
of objects.
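For anyone who wants to experiment with that, splitting the rocksdb DB/WAL out to SSD partitions should just be a couple of ceph.conf options. A rough sketch is below; the exact option names and the /dev/sdb1 / /dev/sdb2 partitions are my assumption of how it's wired up, so double-check them against your build:

[osd]
# assumed: two partitions on the SSD, one for the rocksdb DB and one for its WAL
bluestore block db path = /dev/sdb1
bluestore block wal path = /dev/sdb2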
We haven't done any testing with bcache/dm-cache with bluestore yet,
though Ben England has started looking at dm-cache with
filestore and I imagine will at some point look at bluestore as well.
It will be interesting to see how things work out!
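For reference, a minimal bcache setup in front of one OSD disk might look roughly like the sketch below. This is untested here, the device names are hypothetical, and it assumes bcache-tools is installed:

# SSD partition as the cache device, HDD as the backing device
make-bcache -C /dev/nvme0n1p1 -B /dev/sdb
# if udev doesn't register the devices automatically:
echo /dev/nvme0n1p1 > /sys/fs/bcache/register
echo /dev/sdb > /sys/fs/bcache/register
# the combined device shows up as /dev/bcache0 and can be handed to ceph-disk
ceph-disk prepare /dev/bcache0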
Mark
On 03/14/2016 12:00 PM, Stillwell, Bryan wrote:
Mark,
Since most of us already have existing clusters that use SSDs for
journals, has there been any testing of converting that hardware over to
using BlueStore and re-purposing the SSDs as a block cache (like using
bcache)?
To me this seems like it would be a good combination for a typical RBD
cluster.
Thanks,
Bryan
On 3/14/16, 10:52 AM, "ceph-users on behalf of Mark Nelson"
<ceph-users-bounces@xxxxxxxxxxxxxx on behalf of mnelson@xxxxxxxxxx> wrote:
Hi Folks,
We are actually in the middle of doing some bluestore testing/tuning for
the upstream jewel release as we speak. :) These are (so far) pure HDD
tests using 4 nodes with 4 spinning disks and no SSDs.
Basically, on the write side it's looking fantastic, and that's an area
we really wanted to improve, so that's great. On the read side, we are
working on getting sequential read performance up for certain IO sizes.
We are more dependent on client-side readahead with bluestore since
there is no underlying filesystem below the OSDs helping us out. This
usually isn't a problem in practice since there should be readahead on
the VM, but when testing with fio using the RBD engine you should
probably enable client-side RBD readahead:
rbd readahead disable after bytes = 0
rbd readahead max bytes = 4194304
Again, this probably only matters when directly using librbd.
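For what it's worth, a minimal fio job file for the RBD engine looks something like the sketch below. The pool and image names are placeholders, and the image has to be created beforehand (e.g. with "rbd create"):

[global]
ioengine=rbd
clientname=admin
pool=rbd
# hypothetical pre-created test image
rbdname=fio-test
rw=read
bs=4M
iodepth=32
runtime=60
time_based

[seq-read]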
The other question is whether to use buffered reads by default in
bluestore, i.e. setting:
"bluestore default buffered read = true"
That's what we are working on testing now. I've included the ceph.conf
used for these tests and also a link to some of our recent results.
Please download it and open it in LibreOffice, as Google's preview
isn't showing the graphs.
Here's how the legend is setup:
Hammer-FS: Hammer + Filestore
6dba7fd-BS (No RBD RA): Master + Fixes + Bluestore
6dba7fd-BS (4M RBD RA): Master + Fixes + Bluestore + 4M RBD Read Ahead
c1e41afb-FS: Master + Filestore + new journal throttling + Sam's tuning
https://drive.google.com/file/d/0B2gTBZrkrnpZMl9OZ18yS3NuZEU/view?usp=sharing
Mark
On 03/14/2016 11:04 AM, Kenneth Waegeman wrote:
Hi Stefan,
We are also interested in bluestore, but have not looked into it yet.
We tried keyvaluestore before, and that could be enabled by setting the
osd objectstore value.
And in this ticket http://tracker.ceph.com/issues/13942 I see:
[global]
enable experimental unrecoverable data corrupting features = *
bluestore fsck on mount = true
bluestore block db size = 67108864
bluestore block wal size = 134217728
bluestore block size = 5368709120
osd objectstore = bluestore
So I guess this could work for bluestore too.
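If you do try it, one quick way to confirm an OSD really came up with bluestore should be to ask it over the admin socket on the OSD's host (osd.0 is just an example id):

ceph daemon osd.0 config get osd_objectstore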
Very curious to hear what you see stability- and performance-wise :)
Cheers,
Kenneth
On 14/03/16 16:03, Stefan Lissmats wrote:
Hello everyone!
I think the new bluestore sounds great and would like to try it out in
my test environment, but I didn't find much on how to use it. I finally
managed to test it, though, and it really looks promising
performance-wise.
If anyone has more information or guides for bluestore, please tell me
where to find them.
I thought I would share how I managed to get a new Jewel cluster with
bluestore-based OSDs to work.
What I found so far is that ceph-disk can create new bluestore OSDs
(but not ceph-deploy, please correct me if I'm wrong), and I need to
have "enable experimental unrecoverable data corrupting features =
bluestore rocksdb" in the [global] section of ceph.conf.
After that I can create new OSDs with:
ceph-disk prepare --bluestore /dev/sdg
So I created a cluster with ceph-deploy without any OSDs and then
used ceph-disk on the hosts to create the OSDs.
Pretty simple in the end but it took me a while to figure that out.
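To summarize the whole sequence (the /dev/sdg device is just the example from above, and the explicit activate step may not be needed if udev picks the new partitions up automatically):

# in ceph.conf, [global] section:
#   enable experimental unrecoverable data corrupting features = bluestore rocksdb
ceph-disk prepare --bluestore /dev/sdg
# only if the OSD doesn't come up on its own:
ceph-disk activate /dev/sdg1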
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com