Re: Using bluestore in Jewel 10.0.4

Mark,

Since most of us already have existing clusters that use SSDs for
journals, has there been any testing of converting that hardware over to
using BlueStore and re-purposing the SSDs as a block cache (like using
bcache)?

To me this seems like it would be a good combination for a typical RBD
cluster.
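
Just to make it concrete, the conversion I have in mind would look roughly
like the sketch below (device names are made up, and I haven't actually
tried this with BlueStore yet, so treat it as a rough idea rather than a
recipe):

  # after the OSD that used the SSD journal has been taken out:
  # WARNING: this wipes the old journal contents
  make-bcache -C /dev/sdX      # former journal SSD becomes the cache device
  make-bcache -B /dev/sdY      # spinning disk becomes the backing device
  # udev normally registers both devices; then attach the cache set:
  echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
  echo writeback > /sys/block/bcache0/bcache/cache_mode
  # and build the new BlueStore OSD on /dev/bcache0 instead of the raw HDD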

Thanks,
Bryan

On 3/14/16, 10:52 AM, "ceph-users on behalf of Mark Nelson"
<ceph-users-bounces@xxxxxxxxxxxxxx on behalf of mnelson@xxxxxxxxxx> wrote:

>Hi Folks,
>
>We are actually in the middle of doing some bluestore testing/tuning for
>the upstream jewel release as we speak. :)  These are (so far) pure HDD
>tests using 4 nodes with 4 spinning disks and no SSDs.
>
>Basically, on the write side it's looking fantastic, and that's an area we
>really wanted to improve, so that's great.  On the read side, we are
>working on getting sequential read performance up for certain IO sizes.
>We are more dependent on client-side readahead with bluestore since
>there is no underlying filesystem below the OSDs helping us out.  This
>usually isn't a problem in practice, since there should be readahead in
>the VM, but when testing with fio using the RBD engine you should
>probably enable client-side RBD readahead:
>
>rbd readahead disable after bytes = 0
>rbd readahead max bytes = 4194304
>
>Again, this probably only matters when directly using librbd.
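>
>For example (just a rough sketch, with placeholder pool/image/client
>names), the readahead options above would go in the [client] section of
>the client's ceph.conf, and an fio job driving librbd directly might
>look like this:
>
># ceph.conf on the fio client:
>[client]
>        rbd readahead disable after bytes = 0
>        rbd readahead max bytes = 4194304
>
># fio job file (pool/image/client names are placeholders):
>[seq-read]
>ioengine=rbd
>clientname=admin
>pool=rbd
>rbdname=fio-test
>rw=read
>bs=4m
>iodepth=16
>runtime=60
>time_based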
>
>The other question is whether to use default buffered reads in bluestore,
>i.e. setting:
>
>"bluestore default buffered read = true"
>
>That's what we are working on testing now.  I've included the ceph.conf
>used for these tests and also a link to some of our recent results.
>Please download the file and open it in LibreOffice, as Google's preview
>doesn't show the graphs.
>
>Here's how the legend is setup:
>
>Hammer-FS: Hammer + Filestore
>6dba7fd-BS (No RBD RA): Master + Fixes + Bluestore
>6dba7fd-BS (4M RBD RA): Master + Fixes + Bluestore + 4M RBD Read Ahead
>c1e41afb-FS: Master + Filestore + new journal throttling + Sam's tuning
>
>https://drive.google.com/file/d/0B2gTBZrkrnpZMl9OZ18yS3NuZEU/view?usp=sharing
>
>Mark
>
>On 03/14/2016 11:04 AM, Kenneth Waegeman wrote:
>> Hi Stefan,
>>
>> We are also interested in bluestore, but have not yet looked into it.
>>
>> We tried keyvaluestore before, and that could be enabled by setting the
>> osd objectstore value.
>> In this ticket http://tracker.ceph.com/issues/13942 I see:
>>
>> [global]
>>          enable experimental unrecoverable data corrupting features = *
>>          bluestore fsck on mount = true
>>          bluestore block db size = 67108864
>>          bluestore block wal size = 134217728
>>          bluestore block size = 5368709120
>>          osd objectstore = bluestore
>>
>> So I guess this could work for bluestore too.
>>
>> Very curious to hear what you see stability- and performance-wise :)
>>
>> Cheers,
>> Kenneth
>>
>> On 14/03/16 16:03, Stefan Lissmats wrote:
>>> Hello everyone!
>>>
>>> I think the new bluestore sounds great, and I wanted to try it out in
>>> my test environment, but I couldn't find much on how to use it. I
>>> finally managed to test it, and it really looks promising
>>> performance-wise.
>>> If anyone has more information or guides for bluestore, please tell me
>>> where to find them.
>>>
>>> I thought I would share how I managed to get a new Jewel cluster with
>>> bluestore-based OSDs to work.
>>>
>>>
>>> What I found so far is that ceph-disk can create new bluestore OSDs
>>> (but not ceph-deploy, please correct me if I'm wrong), and I need to
>>> have "enable experimental unrecoverable data corrupting features =
>>> bluestore rocksdb" in the [global] section of ceph.conf.
>>> After that I can create new OSDs with: ceph-disk prepare --bluestore
>>> /dev/sdg
>>>
>>> So I created a cluster with ceph-deploy without any OSDs and then
>>> used ceph-disk on the hosts to create the OSDs.
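>>>
>>> Putting it together, the whole sequence looked roughly like this (the
>>> device name is just an example, and activation may also be triggered
>>> automatically via udev):
>>>
>>> # ceph.conf, [global] section:
>>> enable experimental unrecoverable data corrupting features = bluestore rocksdb
>>>
>>> # then, on each OSD host:
>>> ceph-disk prepare --bluestore /dev/sdg
>>> ceph-disk activate /dev/sdg1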
>>>
>>> Pretty simple in the end but it took me a while to figure that out.


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


