Thanks for sharing, Mohamad. What size are these IOs? The tail latency
breakdown is probably an important factor here too, but I guess you don't
have that. Why EC21? I assume that isn't a config anyone uses in
production...? But I suppose it does facilitate a comparison between
replication and EC using PGs of the same size (a rough sketch of setting up
such a pair of pools is appended below the quoted thread).

Cheers,

On 13 September 2017 at 02:12, Mohamad Gebai <mgebai@xxxxxxx> wrote:
> Sorry for the delay. We used the default k=2 and m=1.
>
> Mohamad
>
>
> On 09/07/2017 06:22 PM, Christian Wuerdig wrote:
>> What type of EC config (k+m) was used, if I may ask?
>>
>> On Fri, Sep 8, 2017 at 1:34 AM, Mohamad Gebai <mgebai@xxxxxxx> wrote:
>>> Hi,
>>>
>>> These numbers are probably not as detailed as you'd like, but it's
>>> something. They show the overhead of reading and/or writing to EC pools
>>> compared to 3x replicated pools using 1, 2, 8 and 16 threads (single
>>> client):
>>>
>>>            Rep      EC        Diff      Slowdown
>>> Threads    IOPS     IOPS
>>> Read
>>>  1         23,325   22,052     -5.46%   1.06
>>>  2         27,261   27,147     -0.42%   1.00
>>>  8         27,151   27,127     -0.09%   1.00
>>> 16         26,793   26,728     -0.24%   1.00
>>> Write
>>>  1         19,444    5,708    -70.64%   3.41
>>>  2         23,902    5,395    -77.43%   4.43
>>>  8         23,912    5,641    -76.41%   4.24
>>> 16         24,587    5,643    -77.05%   4.36
>>> RW
>>>  1         20,379   11,166    -45.21%   1.83
>>>  2         34,246    9,525    -72.19%   3.60
>>>  8         33,195    9,300    -71.98%   3.57
>>> 16         31,641    9,762    -69.15%   3.24
>>>
>>> This is on an all-SSD cluster with 3 OSD nodes and BlueStore. Ceph
>>> version 12.1.0-671-g2c11b88d14 (2c11b88d14e64bf60c0556c6a4ec8c9eda36ff6a)
>>> luminous (rc).
>>>
>>> Mohamad
>>>
>>>
>>> On 09/06/2017 01:28 AM, Blair Bethwaite wrote:
>>>
>>> Hi all,
>>>
>>> (Sorry if this shows up twice - I got auto-unsubscribed, so my first
>>> attempt was blocked.)
>>>
>>> I'm keen to read up on some performance comparisons for replication
>>> versus EC on HDD+SSD based setups. So far the only recent thing I've
>>> found is Sage's Vault17 slides [1], which have a single slide showing
>>> 3X / EC42 / EC51 for Kraken. I guess there is probably some of this data
>>> to be found in the performance meeting threads, but it's hard to know
>>> the currency of those (typically master or wip branch tests) with
>>> respect to releases. Can anyone point out any other references or
>>> highlight something that's coming?
>>>
>>> I'm sure there are piles of operators and architects out there at the
>>> moment wondering how they could and should reconfigure their clusters
>>> once upgraded to Luminous. A couple of things going around in my head at
>>> the moment:
>>>
>>> * We want to get the bulk of our online storage into CephFS on EC
>>>   pool/s...
>>> *-- Is overwrite performance on EC acceptable for near-line NAS
>>>     use-cases?
>>> *-- Recovery implications (currently recovery on our Jewel RGW EC83 pool
>>>     is _way_ slower than on 3X pools) - what does this do to
>>>     reliability? Maybe split capacity into multiple pools if it helps to
>>>     contain failure?
>>>
>>> [1]
>>> https://www.slideshare.net/sageweil1/bluestore-a-new-storage-backend-for-ceph-one-year-in/37
>>>
>>> --
>>> Cheers,
>>> ~Blairo
>>>
>

--
Cheers,
~Blairo

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
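
For anyone wanting to run a similar replication-vs-EC comparison, the pool
setup side is only a few commands. A minimal sketch - pool names, PG counts,
the failure domain and the use of rados bench here are illustrative
assumptions, not the exact methodology behind the numbers above:

  # EC profile matching the default k=2, m=1 discussed in the thread
  # (crush-failure-domain=host is an assumption)
  ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=host
  ceph osd pool create ecpool 128 128 erasure ec21

  # 3x replicated pool as the baseline
  ceph osd pool create reppool 128 128 replicated
  ceph osd pool set reppool size 3

  # needed for RBD/CephFS overwrites on EC pools (Luminous, BlueStore OSDs)
  ceph osd pool set ecpool allow_ec_overwrites true

  # write benchmark, 16 concurrent ops, keeping objects for the read pass
  rados bench -p ecpool 60 write -t 16 --no-cleanup
  rados bench -p reppool 60 write -t 16 --no-cleanup

  # sequential read benchmark against the objects written above
  rados bench -p ecpool 60 seq -t 16
  rados bench -p reppool 60 seq -t 16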