Low write speed to CephFS

Hello all,
Has anybody tried to use CephFS?

I have two servers with RHEL 7.1 (latest kernel, 3.10.0-229.14.1.el7.x86_64). Each server has 15 GB of flash for the Ceph journals and 12 x 2 TB SATA disks for data.
The nodes are interconnected with 56 Gb/s InfiniBand (IPoIB).
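For reference, the raw IPoIB link can be sanity-checked with iperf between the two nodes (just a sketch using the mon addresses from the config below; results not included here):

# iperf -s                   # on ak35
# iperf -c 172.24.32.135     # on ak34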


Cluster version
# ceph -v
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)

Cluster config
# cat /etc/ceph/ceph.conf 
[global]
	auth service required = cephx
	auth client required = cephx
	auth cluster required = cephx
	fsid = 0f05deaf-ee6f-4342-b589-5ecf5527aa6f
	mon osd full ratio = .95
	mon osd nearfull ratio = .90
	osd pool default size = 2
	osd pool default min size = 1
	osd pool default pg num = 32
	osd pool default pgp num = 32
	max open files = 131072
	osd crush chooseleaf type = 1
[mds]

[mds.a]
	host = ak34
	
[mon]
	mon_initial_members = a,b

[mon.a]
	host = ak34
	mon addr  = 172.24.32.134:6789

[mon.b]
	host = ak35
	mon addr  = 172.24.32.135:6789

[osd]
	osd journal size = 1000

[osd.0]
	osd uuid = b3b3cd37-8df5-4455-8104-006ddba2c443
	host = ak34
	public addr  = 172.24.32.134
	osd journal = /CEPH_JOURNAL/osd/ceph-0/journal
.....
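To rule out the journals not actually sitting on the flash device (15 GB shared by 12 journals of 1000 MB each), something like this can be checked on each node (illustrative only):

# df -h /CEPH_JOURNAL
# ls -l /CEPH_JOURNAL/osd/ceph-0/journal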


Cluster tree
# ceph osd tree
ID WEIGHT   TYPE NAME                       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 45.75037 root default                                                      
-2 45.75037     region RU                                                     
-3 45.75037         datacenter ru-msk-ak48t                                   
-4 22.87518             host ak34                                             
 0  1.90627                 osd.0                up  1.00000          1.00000 
 1  1.90627                 osd.1                up  1.00000          1.00000 
 2  1.90627                 osd.2                up  1.00000          1.00000 
 3  1.90627                 osd.3                up  1.00000          1.00000 
 4  1.90627                 osd.4                up  1.00000          1.00000 
 5  1.90627                 osd.5                up  1.00000          1.00000 
 6  1.90627                 osd.6                up  1.00000          1.00000 
 7  1.90627                 osd.7                up  1.00000          1.00000 
 8  1.90627                 osd.8                up  1.00000          1.00000 
 9  1.90627                 osd.9                up  1.00000          1.00000 
10  1.90627                 osd.10               up  1.00000          1.00000 
11  1.90627                 osd.11               up  1.00000          1.00000 
-5 22.87518             host ak35                                             
12  1.90627                 osd.12               up  1.00000          1.00000 
13  1.90627                 osd.13               up  1.00000          1.00000 
14  1.90627                 osd.14               up  1.00000          1.00000 
15  1.90627                 osd.15               up  1.00000          1.00000 
16  1.90627                 osd.16               up  1.00000          1.00000 
17  1.90627                 osd.17               up  1.00000          1.00000 
18  1.90627                 osd.18               up  1.00000          1.00000 
19  1.90627                 osd.19               up  1.00000          1.00000 
20  1.90627                 osd.20               up  1.00000          1.00000 
21  1.90627                 osd.21               up  1.00000          1.00000 
22  1.90627                 osd.22               up  1.00000          1.00000 
23  1.90627                 osd.23               up  1.00000          1.00000 

Status of cluster
# ceph -s
    cluster 0f05deaf-ee6f-4342-b589-5ecf5527aa6f
     health HEALTH_OK
     monmap e1: 2 mons at {a=172.24.32.134:6789/0,b=172.24.32.135:6789/0}
            election epoch 10, quorum 0,1 a,b
     mdsmap e14: 1/1/1 up {0=a=up:active}
     osdmap e194: 24 osds: 24 up, 24 in
      pgmap v2305: 384 pgs, 3 pools, 271 GB data, 72288 objects
            545 GB used, 44132 GB / 44678 GB avail
                 384 active+clean


Pools for CephFS
# ceph osd dump | grep pg
pool 1 'cephfs_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 154 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 2 'cephfs_metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 144 flags hashpspool stripe_width 0
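For context, a CephFS on these two pools is normally created and listed like this (a sketch; the fs name 'cephfs' is illustrative, not copied from this cluster):

# ceph fs new cephfs cephfs_metadata cephfs_data
# ceph fs ls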

Rados bench
# rados bench -p cephfs_data 300 write --no-cleanup && rados bench -p cephfs_data 300 seq 
 Maintaining 16 concurrent writes of 4194304 bytes for up to 300 seconds or 0 objects
 Object prefix: benchmark_data_XXXXXXXXXXXXXXXXXXXX_8108
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     0       0         0         0         0         0         -         0
     1      16       170       154    615.74       616  0.109984 0.0978277
     2      16       335       319   637.817       660 0.0623079 0.0985001
     3      16       496       480   639.852       644 0.0992808 0.0982317
     4      16       662       646   645.862       664 0.0683485 0.0980203
     5      16       831       815   651.796       676 0.0773545 0.0973635
     6      15       994       979   652.479       656  0.112323  0.096901
     7      16      1164      1148   655.826       676  0.107592 0.0969845
     8      16      1327      1311   655.335       652 0.0960067 0.0968445
     9      16      1488      1472   654.066       644 0.0780589 0.0970879

.....
   297      16     43445     43429   584.811       596 0.0569516  0.109399
   298      16     43601     43585   584.942       624 0.0707439  0.109388
   299      16     43756     43740   585.059       620   0.20408  0.109363
2015-10-15 14:16:59.622610 min lat: 0.0109677 max lat: 0.951389 avg lat: 0.109344
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   300      13     43901     43888   585.082       592 0.0768806  0.109344
 Total time run:        300.329089
Total reads made:     43901
Read size:            4194304
Bandwidth (MB/sec):    584.705 

Average Latency:       0.109407
Max latency:           0.951389
Min latency:           0.0109677
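Note that rados bench above runs 16 concurrent 4 MB writes, which is a very different pattern from the single-stream 4k O_DIRECT writes below. A closer rados-level comparison would be a small-block, single-threaded run, roughly (not run here, just a sketch):

# rados bench -p cephfs_data 60 write -b 4096 -t 1 --no-cleanup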

But the actual write speed to the filesystem is very low:

# dd if=/dev/zero|pv|dd oflag=direct of=44444 bs=4k count=10k
10240+0 records in
10240+0 records out
41943040 bytes (42 MB) copied, 25.9155 s, 1.6 MB/s
40.1MiB 0:00:25 [1.55MiB/s]

# dd if=/dev/zero|pv|dd oflag=direct of=44444 bs=32k count=10k
10240+0 records in
10240+0 records out
335544320 bytes (336 MB) copied, 28.2998 s, 11.9 MB/s
320MiB 0:00:28 [11.3MiB/s]
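For comparison, a buffered large-block write to the same directory could be measured roughly like this (an untested sketch; conv=fdatasync only forces a flush at the end instead of O_DIRECT on every block):

# dd if=/dev/zero of=44444 bs=4M count=256 conv=fdatasync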


Does anyone know the root cause of the low write speed to the FS?

Thank you in advance for your help!

-- 
Best Regards,
Stanislav Butkeev
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



