Re: [Ceph] 10Gb network support

Well, there's obviously something very wrong in your hardware or configuration. Looking at the rados bench results I see a very large standard deviation and very high latencies, which leads me to believe that something is probably wrong with your journal. Try running
"ceph -w" in one window and "ceph osd tell \* bench" in another, then wait and see what results come in.

Oh -- did you mean to test with not-quite-40MB objects? With 100 of those in flight that's at least 4GB of RAM on your client node, so maybe you've just run it out of memory?
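(Roughly: 40485760 bytes per object × 100 concurrent ops from -t 100 ≈ 4 GB of write buffers in flight on the client at once.)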
-Greg

On Thursday, September 12, 2013, Kuo Hugo wrote:
Hi Gregory, 


The full rados bench command was:
$> rados bench 100  write -t 100 -p .rgw.buckets --block-size 40485760

The network bandwidth between the rados client and the OSDs is 10Gb: 192.168.2.51 --> 192.168.2.61 (one of the storage nodes, which has 10 OSDs).

[  3] local 192.168.2.51 port 52256 connected with 192.168.2.61 port 5001
[  3]  0.0-10.0 sec  10.7 GBytes  9.19 Gbits/sec 

Would you please explain more about *lower-level* tests? Do you mean something like a single-disk I/O performance test? If so, no, I haven't run one, but it seems very unlikely that all of the drives are bad.

As I understand the RADOS data path, when I upload an object from the RADOS client (assuming replicas=3), it should be:
RADOS client --> MON.0 (get CRUSH map) --> three of the 30 OSDs

Ideally, uploading 10~20 objects simultaneously should be able to fill the 10Gb network.
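For reference, the write path in general Ceph terms (not specific to this cluster) is roughly: the client fetches the cluster map from a monitor, computes the target placement group and OSDs locally via CRUSH, then writes to the primary OSD, which replicates to the other two:

client --(get cluster map)--> MON
client --(CRUSH, computed locally)--> primary OSD --> 2 replica OSDs
primary OSD --(ack after replicas commit)--> client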



Thanks




+Hugo Kuo+
(+886) 935004793


2013/9/13 Gregory Farnum <greg@xxxxxxxxxxx>
What command did you use to get those results? Have you tried increasing parallelism? What bandwidth do you have between that machine and your OSDs? Have you run lower-level tests on individual disks and nodes to make sure they're performing as you expect?
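For instance, a quick lower-level baseline could look something like the following (the file path and node address are placeholders; the dd target should be a scratch file on the filesystem backing the disk under test, since it writes ~1GB with direct I/O):

$> dd if=/dev/zero of=/path/on/osd/disk/ddtest bs=4M count=256 oflag=direct
$> iperf -s                          # on the target node
$> iperf -c <osd_node_ip> -t 10      # from the client node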
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Sep 12, 2013 at 7:47 AM, Kuo Hugo <tonytkdk@xxxxxxxxx> wrote:
Hi folks, 

I deployed a Ceph cluster with 10Gb network devices, but the maximum bandwidth usage is only about 100MB/sec.
Do I need to enable or set up anything for 10Gb support?



My rados bench results:
Total time run:         101.265252
Total writes made:      236
Write size:             40485760
Bandwidth (MB/sec):     89.982

Stddev Bandwidth:       376.238
Max bandwidth (MB/sec): 3822.41
Min bandwidth (MB/sec): 0
Average Latency:        33.9225
Stddev Latency:         12.8661
Max latency:            43.6013
Min latency:            1.03948
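(Sanity check: 236 writes × ~38.6 MB each ≈ 9100 MB over the 101.3-second run, i.e. the ~90 MB/sec reported above.)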


I checked the network bandwidth between nodes with iperf.

[Iperf]
From BM to RadosGW
local 192.168.2.51 port 5001 connected with 192.168.2.40 port 39421
0.0-10.0 sec  10.1 GBytes  8.69 Gbits/sec

From RadosGW to Rados nodes

[  3] local 192.168.2.51 port 52256 connected with 192.168.2.61 port 5001
[  3]  0.0-10.0 sec  10.7 GBytes  9.19 Gbits/sec 

[  3] local 192.168.2.51 port 52256 connected with 192.168.2.62 port 5001
[  3]  0.0-10.0 sec  9.2 GBytes  8.1 Gbits/sec 

[  3] local 192.168.2.51 port 51196 connected with 192.168.2.63 port 5001
[  3]  0.0-10.0 sec  10.7 GBytes  9.21 Gbits/sec
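So raw node-to-node throughput is about 9 Gbits/sec ≈ 1.1 GBytes/sec, more than ten times the ~90 MB/sec that rados bench achieves.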


All OSDs are listening on 192.168.2.x 

My OSD dump : 

2013-09-12 07:43:42.556501 7f026a66b780 -1 asok(0x1c9d510) AdminSocketConfigObs


--
Software Engineer #42 @ http://inktank.com | http://ceph.com
