Hi Mark,
The results are below. These numbers look good, but I'm not sure what to conclude from them.
# rados -p performance_test bench 120 write -b 4194304 -t 100 --no-cleanup
Total time run: 120.133251
Total writes made: 17529
Write size: 4194304
Bandwidth (MB/sec): 583.652
Stddev Bandwidth: 269.76
Max bandwidth (MB/sec): 884
Min bandwidth (MB/sec): 0
Average Latency: 0.68418
Stddev Latency: 0.552344
Max latency: 5.06959
Min latency: 0.121746
Total time run: 58.451831
Total reads made: 17529
Read size: 4194304
Bandwidth (MB/sec): 1199.552
Average Latency: 0.332538
Max latency: 3.72943
Min latency: 0.007074
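(As a sanity check on those figures: bandwidth is simply total data over total time, and both runs line up, using only the totals reported above.)

echo "17529 * 4 / 120.133251" | bc -l    # ~583.65 MB/s for the write run, matching the reported 583.652
echo "17529 * 4 / 58.451831" | bc -l     # ~1199.55 MB/s for the seq read run, matching the reported 1199.552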
On Wed, Nov 19, 2014 at 8:55 PM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:
On 11/19/2014 06:51 PM, Jay Janardhan wrote:
Can someone help me figure out what I can tune to improve performance? The cluster is pushing data at about 13 MB/s with a single copy of the data, while the underlying disks can push 100+ MB/s.
Can anyone help me with this?
*rados bench results:*
Concurrency   Replication size   Write (MB/s)   Seq Read (MB/s)
32            1                  13.5           32.8
32            2                  12.7           32.0
32            3                   6.1           30.2
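(For rough context, and assuming the MB/s figures are MiB/s as rados bench reports them, the 8 KiB write number works out to roughly 1700 write ops/sec at replication 1:)

echo "13.5 * 1024 / 8" | bc    # ~1728 x 8 KiB write ops/sec at replication size 1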
*Commands I used (Pool size was updated appropriately):*
rados -p performance_test bench 120 write -b 8192 -t 100 --no-cleanup
rados -p performance_test bench 120 seq -t 100
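(If it helps, the pool settings in effect for each run can be double-checked with, for example:)

ceph osd pool get performance_test size
ceph osd pool get performance_test min_size
ceph osd pool get performance_test pg_num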
How's performance if you do:
rados -p performance_test bench 120 write -b 4194304 -t 100 --no-cleanup
and
rados -p performance_test bench 120 seq -b 4194304 -t 100
instead?
Mark
*1) Disk tests - all disks have similar numbers:*
# dd if=/dev/zero of=here bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 10.0691 s, 107 MB/s
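(That dd measures large sequential direct writes, which corresponds more to the 4 MiB rados results than to the 8 KiB ones. If useful, a closer analogue to small synchronous writes would be something like:)

dd if=/dev/zero of=here bs=8k count=10000 oflag=direct,dsync    # ~80 MB of 8 KiB direct+sync writes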
*2) 10G network is not the bottleneck*
# iperf -c 10.13.10.15 -i2 -t 10
------------------------------------------------------------
Client connecting to 10.13.10.15, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.13.30.13 port 56459 connected with 10.13.10.15 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 2.0 sec 2.17 GBytes 9.33 Gbits/sec
[ 3] 2.0- 4.0 sec 2.18 GBytes 9.37 Gbits/sec
[ 3] 4.0- 6.0 sec 2.18 GBytes 9.37 Gbits/sec
[ 3] 6.0- 8.0 sec 2.18 GBytes 9.38 Gbits/sec
[ 3] 8.0-10.0 sec 2.18 GBytes 9.37 Gbits/sec
[ 3] 0.0-10.0 sec 10.9 GBytes 9.36 Gbits/sec
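(That is a single TCP stream on the public network. To rule the network out completely, parallel streams and the cluster network (10.2.0.0/16 per the config below) could be checked the same way; the second target address is a placeholder:)

iperf -c 10.13.10.15 -P 4 -t 10
iperf -c <an OSD host's 10.2.0.0/16 address> -t 10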
*3) Ceph Status*
# ceph health
HEALTH_OK
root@us1-r04u05s01-ceph:~# ceph status
cluster 5e95b6fa-0b99-4c31-8aa9-7a88h6hc5eda
health HEALTH_OK
monmap e4: 4 mons at {us1-r01u05s01-ceph=10.1.30.10:6789/0,us1-r01u09s01-ceph=10.1.30.11:6789/0,us1-r04u05s01-ceph=10.1.30.14:6789/0,us1-r04u09s01-ceph=10.1.30.15:6789/0},
election epoch 78, quorum 0,1,2,3 us1-r01u05s01-ceph,us1-r01u09s01-ceph,us1-r04u05s01-ceph,us1-r04u09s01-ceph
osdmap e1029: 97 osds: 97 up, 97 in
pgmap v1850869: 12480 pgs, 6 pools, 587 GB data, 116 kobjects
1787 GB used, 318 TB / 320 TB avail
12480 active+clean
client io 0 B/s rd, 25460 B/s wr, 20 op/s
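(Per-OSD latency and raw per-OSD write throughput can also be sampled with, for example, the following; osd.0 is picked arbitrarily, and the tell bench writes about 1 GB by default:)

ceph osd perf
ceph tell osd.0 bench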
*4) Ceph configuration* (cluster_network = 10.2.0.0/16)
# cat ceph.conf
[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
cephx require signatures = True
cephx cluster require signatures = True
cephx service require signatures = False
fsid = 5e95b6fa-0b99-4c31-8aa9-7a88h6hc5eda
osd pool default pg num = 4096
osd pool default pgp num = 4096
osd pool default size = 3
osd pool default min size = 1
osd pool default crush rule = 0
# Disable in-memory logs
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcacher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0
[mon]
mon osd down out interval = 600
mon osd min down reporters = 2
[mon.us1-r01u05s01-ceph]
host = us1-r01u05s01-ceph
mon addr = 10.1.30.10
[mon.us1-r01u09s01-ceph]
host = us1-r01u09s01-ceph
mon addr = 10.1.30.11
[mon.us1-r04u05s01-ceph]
host = us1-r04u05s01-ceph
mon addr = 10.1.30.14
[mon.us1-r04u09s01-ceph]
host = us1-r04u09s01-ceph
mon addr = 10.1.30.15
[osd]
osd mkfs type = xfs
osd mkfs options xfs = -f -i size=2048
osd mount options xfs = noatime
osd journal size = 10000
public_network = 10.1.0.0/16
osd mon heartbeat interval = 30
# Performance tuning
filestore merge threshold = 40
filestore split multiple = 8
osd op threads = 8
filestore op threads = 8
filestore max sync interval = 5
osd max scrubs = 1
# Recovery tuning
osd recovery max active = 5
osd max backfills = 2
osd recovery op priority = 2
osd recovery max chunk = 8388608
osd recovery threads = 1
osd objectstore = filestore
osd crush update on start = true
[mds]
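(To confirm the running daemons actually picked these values up, the admin socket can be queried on an OSD host; osd.42 is picked arbitrarily from the node shown further below:)

ceph daemon osd.42 config show | grep -E 'osd_op_threads|filestore_op_threads|filestore_max_sync_interval'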
*5) Ceph OSDs/Crushmap*
# ceph osd tree
# id    weight  type name                          up/down reweight
-14     5.46    root fusion_drives
-11     5.46      rack rack01-fusion
-7      2.73        host us1-r01u25s01-compf-fusion
89      2.73          osd.89                       up      1
-9      2.73        host us1-r01u23s01-compf-fusion
96      2.73          osd.96                       up      1
-13     315.2   root sata_drives
-10     166       rack rack01-sata
-2      76.44       host us1-r01u05s01-ceph
0       3.64          osd.0                        up      1
1       3.64          osd.1                        up      1
2       3.64          osd.2                        up      1
3       3.64          osd.3                        up      1
4       3.64          osd.4                        up      1
5       3.64          osd.5                        up      1
6       3.64          osd.6                        up      1
7       3.64          osd.7                        up      1
8       3.64          osd.8                        up      1
9       3.64          osd.9                        up      1
10      3.64          osd.10                       up      1
11      3.64          osd.11                       up      1
12      3.64          osd.12                       up      1
13      3.64          osd.13                       up      1
14      3.64          osd.14                       up      1
15      3.64          osd.15                       up      1
16      3.64          osd.16                       up      1
17      3.64          osd.17                       up      1
18      3.64          osd.18                       up      1
19      3.64          osd.19                       up      1
20      3.64          osd.20                       up      1
-3      76.44       host us1-r01u09s01-ceph
21      3.64          osd.21                       up      1
22      3.64          osd.22                       up      1
23      3.64          osd.23                       up      1
24      3.64          osd.24                       up      1
25      3.64          osd.25                       up      1
26      3.64          osd.26                       up      1
27      3.64          osd.27                       up      1
28      3.64          osd.28                       up      1
29      3.64          osd.29                       up      1
30      3.64          osd.30                       up      1
31      3.64          osd.31                       up      1
32      3.64          osd.32                       up      1
33      3.64          osd.33                       up      1
34      3.64          osd.34                       up      1
35      3.64          osd.35                       up      1
36      3.64          osd.36                       up      1
37      3.64          osd.37                       up      1
38      3.64          osd.38                       up      1
39      3.64          osd.39                       up      1
40      3.64          osd.40                       up      1
41      3.64          osd.41                       up      1
-6      6.54        host us1-r01u25s01-compf-sata
83      1.09          osd.83                       up      1
84      1.09          osd.84                       up      1
85      1.09          osd.85                       up      1
86      1.09          osd.86                       up      1
87      1.09          osd.87                       up      1
88      1.09          osd.88                       up      1
-8      6.54        host us1-r01u23s01-compf-sata
90      1.09          osd.90                       up      1
91      1.09          osd.91                       up      1
92      1.09          osd.92                       up      1
93      1.09          osd.93                       up      1
94      1.09          osd.94                       up      1
95      1.09          osd.95                       up      1
-12     149.2     rack rack04-sata
-4      72.8        host us1-r04u05s01-ceph
42      3.64          osd.42                       up      1
43      3.64          osd.43                       up      1
44      3.64          osd.44                       up      1
45      3.64          osd.45                       up      1
46      3.64          osd.46                       up      1
47      3.64          osd.47                       up      1
48      3.64          osd.48                       up      1
49      3.64          osd.49                       up      1
50      3.64          osd.50                       up      1
51      3.64          osd.51                       up      1
52      3.64          osd.52                       up      1
53      3.64          osd.53                       up      1
54      3.64          osd.54                       up      1
55      3.64          osd.55                       up      1
56      3.64          osd.56                       up      1
57      3.64          osd.57                       up      1
58      3.64          osd.58                       up      1
59      3.64          osd.59                       up      1
60      3.64          osd.60                       up      1
61      3.64          osd.61                       up      1
-5      76.44       host us1-r04u09s01-ceph
62      3.64          osd.62                       up      1
63      3.64          osd.63                       up      1
64      3.64          osd.64                       up      1
65      3.64          osd.65                       up      1
66      3.64          osd.66                       up      1
67      3.64          osd.67                       up      1
68      3.64          osd.68                       up      1
69      3.64          osd.69                       up      1
70      3.64          osd.70                       up      1
71      3.64          osd.71                       up      1
72      3.64          osd.72                       up      1
73      3.64          osd.73                       up      1
74      3.64          osd.74                       up      1
75      3.64          osd.75                       up      1
76      3.64          osd.76                       up      1
77      3.64          osd.77                       up      1
78      3.64          osd.78                       up      1
79      3.64          osd.79                       up      1
80      3.64          osd.80                       up      1
81      3.64          osd.81                       up      1
82      3.64          osd.82                       up      1
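(Since there are two roots here, fusion_drives and sata_drives, it may be worth confirming which CRUSH rule, and therefore which root, the test pool actually maps to, for example:)

ceph osd pool get performance_test crush_ruleset
ceph osd crush rule dump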
*6) OSDs from one of the cluster nodes (rest are similar)*
/dev/sda1      3905109820   16741944   3888367876   1%   /var/lib/ceph/osd/ceph-42
/dev/sdb1      3905109820   19553976   3885555844   1%   /var/lib/ceph/osd/ceph-43
/dev/sdc1      3905109820   18081680   3887028140   1%   /var/lib/ceph/osd/ceph-44
/dev/sdd1      3905109820   19070596   3886039224   1%   /var/lib/ceph/osd/ceph-45
/dev/sde1      3905109820   17949284   3887160536   1%   /var/lib/ceph/osd/ceph-46
/dev/sdf1      3905109820   18538344   3886571476   1%   /var/lib/ceph/osd/ceph-47
/dev/sdg1      3905109820   17792608   3887317212   1%   /var/lib/ceph/osd/ceph-48
/dev/sdh1      3905109820   20910976   3884198844   1%   /var/lib/ceph/osd/ceph-49
/dev/sdi1      3905109820   19683208   3885426612   1%   /var/lib/ceph/osd/ceph-50
/dev/sdj1      3905109820   20115236   3884994584   1%   /var/lib/ceph/osd/ceph-51
/dev/sdk1      3905109820   19152812   3885957008   1%   /var/lib/ceph/osd/ceph-52
/dev/sdm1      3905109820   18701728   3886408092   1%   /var/lib/ceph/osd/ceph-53
/dev/sdn1      3905109820   19603536   3885506284   1%   /var/lib/ceph/osd/ceph-54
/dev/sdo1      3905109820   20164928   3884944892   1%   /var/lib/ceph/osd/ceph-55
/dev/sdp1      3905109820   19093024   3886016796   1%   /var/lib/ceph/osd/ceph-56
/dev/sdq1      3905109820   18699344   3886410476   1%   /var/lib/ceph/osd/ceph-57
/dev/sdr1      3905109820   19267068   3885842752   1%   /var/lib/ceph/osd/ceph-58
/dev/sds1      3905109820   19745212   3885364608   1%   /var/lib/ceph/osd/ceph-59
/dev/sdt1      3905109820   16321696   3888788124   1%   /var/lib/ceph/osd/ceph-60
/dev/sdu1      3905109820   19154884   3885954936   1%   /var/lib/ceph/osd/ceph-61
*7) Journal Files (there are TWO SSDs)*
# parted /dev/sdy print
Model: ATA SanDisk SD7UB2Q5 (scsi)
Disk /dev/sdy: 512GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Number Start End Size File system Name Flags
1 1049kB 10.5GB 10.5GB ceph journal
2 10.5GB 21.0GB 10.5GB ceph journal
3 21.0GB 31.5GB 10.5GB ceph journal
4 31.5GB 41.9GB 10.5GB ceph journal
5 41.9GB 52.4GB 10.5GB ceph journal
6 52.4GB 62.9GB 10.5GB ceph journal
7 62.9GB 73.4GB 10.5GB ceph journal
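(With filestore, each OSD's journal is a symlink in its data directory, so it is straightforward to confirm that each OSD really journals to one of these SSD partitions; osd.42 is picked as an example:)

ls -l /var/lib/ceph/osd/ceph-42/journal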
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com