Re: Performance question

I dunno, I think I just go into my Lotus and mull this over ;) (I wish)

This is storage for KVM, and we have quite a few boxes.  While none are suffering from IO load right now, I am personally seeing slowdowns and know that sooner or later others will notice as well.

I think what I will do is remove the SSD from the cluster, and put journals on it.  

On Tue, Nov 24, 2015 at 11:42 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:

Separate would be best, but as with many things in life we are not all driving around in sports cars!!

 

Moving the journals to the SSDs that are also OSDs themselves will be fine. SSDs tend to be more bandwidth limited than IOPS limited, and the reverse is true for disks, so you will get maybe a 2x improvement for the disk pool and you probably won’t even notice the impact on the SSD pool.

 

Can I just ask what your workload will be? There may be other things that can be done.

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Marek Dohojda
Sent: 24 November 2015 18:32
To: Alan Johnson <alanj@xxxxxxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx; Nick Fisk <nick@xxxxxxxxxx>


Subject: Re: Performance question

 

Thank you! I will do that.  Would you suggest getting another SSD drive or moving the journals to the SSD OSDs?

 

(Sorry for a stupid question, if that is such).

 

On Tue, Nov 24, 2015 at 11:25 AM, Alan Johnson <alanj@xxxxxxxxxxxxxx> wrote:

Or separate the journals, as this will bring the workload on the spinners down to 3X rather than 6X.

 

From: Marek Dohojda [mailto:mdohojda@xxxxxxxxxxxxxxxxxxx]
Sent: Tuesday, November 24, 2015 1:24 PM
To: Nick Fisk
Cc: Alan Johnson; ceph-users@xxxxxxxxxxxxxx


Subject: Re: Performance question

 

Crad I think you are 100% correct:

 

rrqm/s   wrqm/s     r/s     w/s    rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
  0.00   369.00   33.00 1405.00   132.00 135656.00   188.86     5.61    4.02   21.94    3.60   0.70 100.00
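As a rough sketch of how the iostat row above can be read, the Python below pairs each header field with its value and derives the write throughput in MB/s and the average request size. The 512-byte sector size for avgrq-sz follows the usual iostat convention; the function names and the 99% "saturated" threshold are illustrative assumptions, not anything from the thread. Note that on SSDs, which serve requests in parallel, a %util of 100 does not by itself prove the device has no headroom left.

```python
# Field names and values copied from the iostat output quoted in this message.
HEADER = ("rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz "
          "await r_await w_await svctm %util")
SAMPLE = ("0.00 369.00 33.00 1405.00 132.00 135656.00 188.86 5.61 "
          "4.02 21.94 3.60 0.70 100.00")

SECTOR_BYTES = 512  # avgrq-sz is reported in 512-byte sectors


def parse_iostat_row(header: str, row: str) -> dict:
    """Pair each header field with its numeric value."""
    names = header.split()
    values = [float(v) for v in row.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different field counts")
    return dict(zip(names, values))


def summarize(stats: dict) -> dict:
    """Derive a few human-readable figures from one iostat row."""
    total_iops = stats["r/s"] + stats["w/s"]
    total_kb = stats["rkB/s"] + stats["wkB/s"]
    return {
        "write_mb_per_s": stats["wkB/s"] / 1024,
        # Average request size computed two ways, as a consistency check.
        "avg_request_kb": total_kb / total_iops if total_iops else 0.0,
        "avgrq_sz_kb": stats["avgrq-sz"] * SECTOR_BYTES / 1024,
        # Hypothetical threshold, chosen only for illustration.
        "saturated": stats["%util"] >= 99.0,
    }


stats = parse_iostat_row(HEADER, SAMPLE)
summary = summarize(stats)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Read this way, the device is absorbing roughly 132 MB/s of writes in requests of about 94 KB each while reporting 100% utilization.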

 

I was kind of wondering whether this may be the case, which is why I was wondering whether I should be doing much in terms of troubleshooting.

 

So basically what you are saying is that I need to wait for a new version?

 

 

Thank you very much everybody!

 

 

On Tue, Nov 24, 2015 at 9:35 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:

You haven’t stated what replication size you are running. Keep in mind that with a replication factor of 3, you will be writing 6x the amount of data to the disks compared with what the benchmark reports (3x replication, x2 for the data + journal write).
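The arithmetic behind that 6x figure can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the replication size of 3 is an assumption (it has not been stated in the thread), the 50.471 MB/s input is the SAS rados bench figure quoted in this thread, the 7 disks are the 7 SAS OSDs mentioned in the original question, and writes are assumed to spread evenly across them. The function names are hypothetical.

```python
def raw_disk_write_mb_s(client_mb_s: float,
                        replication: int = 3,
                        journal_on_same_disk: bool = True) -> float:
    """Estimate the total MB/s actually written to the data disks.

    Each client write is stored `replication` times, and when the journal
    shares the disk with the data, every copy is written twice
    (once to the journal, once to the data store).
    """
    journal_factor = 2 if journal_on_same_disk else 1
    return client_mb_s * replication * journal_factor


def per_disk_mb_s(total_mb_s: float, disks: int) -> float:
    """Spread the total evenly over the disks (an idealised assumption)."""
    return total_mb_s / disks


CLIENT_MB_S = 50.471  # SAS pool rados bench result quoted in this thread
SAS_DISKS = 7         # 7 SAS OSDs, per the original question

colocated = raw_disk_write_mb_s(CLIENT_MB_S, journal_on_same_disk=True)
separate = raw_disk_write_mb_s(CLIENT_MB_S, journal_on_same_disk=False)

print(f"journal on same disk: {colocated:.1f} MB/s total, "
      f"{per_disk_mb_s(colocated, SAS_DISKS):.1f} MB/s per disk")
print(f"journal elsewhere:    {separate:.1f} MB/s total, "
      f"{per_disk_mb_s(separate, SAS_DISKS):.1f} MB/s per disk")
```

Under these assumptions, moving the journals off the spinners halves the write load each spinner has to absorb for the same client throughput.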

 

You might actually be near the hardware maximums. What does iostat look like whilst you are running rados bench? Are the disks getting maxed out?

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Marek Dohojda
Sent: 24 November 2015 16:27
To: Alan Johnson <alanj@xxxxxxxxxxxxxx>


Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Performance question

 

7 total servers, 20 GIG pipe between servers, both reads and writes.  The network itself has plenty of pipe left; it is averaging 40 Mbit/s.

 

Rados Bench SAS 30 writes

Total time run:         30.591927
Total writes made:      386
Write size:             4194304
Bandwidth (MB/sec):     50.471
Stddev Bandwidth:       48.1052
Max bandwidth (MB/sec): 160
Min bandwidth (MB/sec): 0
Average Latency:        1.25908
Stddev Latency:         2.62018
Max latency:            21.2809
Min latency:            0.029227

 

Rados Bench SSD writes

Total time run:         20.425192
Total writes made:      1405
Write size:             4194304
Bandwidth (MB/sec):     275.150
Stddev Bandwidth:       122.565
Max bandwidth (MB/sec): 576
Min bandwidth (MB/sec): 0
Average Latency:        0.231803
Stddev Latency:         0.190978
Max latency:            0.981022
Min latency:            0.0265421

 

 

As you can see, SSD is better, but not by as much as I would expect.
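For a quick sanity check of the two runs, the sketch below recomputes the reported bandwidth from the total writes, write size, and run time, and then takes the SSD-to-SAS ratio. The class and field names are illustrative; only the numbers come from the rados bench output above, and MB is taken as 2^20 bytes, which is what makes the recomputed figures line up with the reported ones.

```python
from dataclasses import dataclass

MIB = 1024 * 1024  # rados bench "MB" treated as 2**20 bytes (assumption)


@dataclass
class BenchRun:
    """One rados bench write run, using the figures from its summary."""
    label: str
    seconds: float
    writes: int
    write_size_bytes: int

    def bandwidth_mb_s(self) -> float:
        """Total data written divided by the total run time."""
        return self.writes * self.write_size_bytes / MIB / self.seconds


sas = BenchRun("SAS", seconds=30.591927, writes=386, write_size_bytes=4194304)
ssd = BenchRun("SSD", seconds=20.425192, writes=1405, write_size_bytes=4194304)

for run in (sas, ssd):
    print(f"{run.label}: {run.bandwidth_mb_s():.3f} MB/s")

ratio = ssd.bandwidth_mb_s() / sas.bandwidth_mb_s()
print(f"SSD pool is about {ratio:.1f}x the SAS pool on this test")
```

The recomputed values agree with the reported 50.471 and 275.150 MB/sec, so the SSD pool is roughly five and a half times faster than the SAS pool for 4 MB writes in these runs.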

 

 

 

On Tue, Nov 24, 2015 at 9:10 AM, Alan Johnson <alanj@xxxxxxxxxxxxxx> wrote:

Hard to know without more config details, such as the number of servers and the network (GigE or 10 GigE). I am also not sure how you are measuring (reads or writes); you could try RADOS bench as a baseline. I would expect more performance with 7 x 10K spinners journaled to SSDs. The fact that the SSDs did not perform much better may point to a bottleneck elsewhere, perhaps the network?

From: Marek Dohojda [mailto:mdohojda@xxxxxxxxxxxxxxxxxxx]
Sent: Tuesday, November 24, 2015 10:37 AM
To: Alan Johnson
Cc: Haomai Wang; ceph-users@xxxxxxxxxxxxxx


Subject: Re: [ceph-users] Performance question

 

Yeah, they are; that is one thing I was planning on changing. What I am really interested in at the moment is a rough idea of expected performance.  I mean, is 100MB around normal, very low, or "could be better"?

 

On Tue, Nov 24, 2015 at 8:02 AM, Alan Johnson <alanj@xxxxxxxxxxxxxx> wrote:

Are the journals on the same device – it might be better to use the SSDs for journaling since you are not getting better performance with SSDs?

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Marek Dohojda
Sent: Monday, November 23, 2015 10:24 PM
To: Haomai Wang
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Performance question

 

Sorry, I should have specified that SAS is the 100 MB :), but to be honest SSD isn't much faster.

 

On Mon, Nov 23, 2015 at 7:38 PM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:

On Tue, Nov 24, 2015 at 10:35 AM, Marek Dohojda
<mdohojda@xxxxxxxxxxxxxxxxxxx> wrote:
> No SSD and SAS are in two separate pools.
>
> On Mon, Nov 23, 2015 at 7:30 PM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
>>
>> On Tue, Nov 24, 2015 at 10:23 AM, Marek Dohojda
>> <mdohojda@xxxxxxxxxxxxxxxxxxx> wrote:
>> > I have a Hammer Ceph cluster on 7 nodes with total 14 OSDs.  7 of which
>> > are
>> > SSD and 7 of which are SAS 10K drives.  I get typically about 100MB IO
>> > rates
>> > on this cluster.

So which pool do you get 100 MB with?


>>
>> You mixed up sas and ssd in one pool?
>>
>> >
>> > I have a simple question.  Is 100MB within my configuration what I
>> > should
>> > expect, or should it be higher? I am not sure if I should be looking for
>> > issues, or just accept what I have.
>> >
>> > _______________________________________________
>> > ceph-users mailing list
>> > ceph-users@xxxxxxxxxxxxxx

>> > http://xo4t.mj.am/link/xo4t/rslwlms/1/BMAuqvTZa9PuDgefDPxnDw/aHR0cDovL3hvNHQubWouYW0vbGluay94bzR0L3JzeGppdDEvMS9ObEVxaHVhMnJPSHhtWGRpT0NMX3dBL2FIUjBjRG92TDJ4cGMzUnpMbU5sY0dndVkyOXRMMnhwYzNScGJtWnZMbU5uYVM5alpYQm9MWFZ6WlhKekxXTmxjR2d1WTI5dA
>> >
>>
>>
>>
>> --
>> Best Regards,
>>
>> Wheat
>
>

--
Best Regards,

Wheat

 

 

 


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
