Ceph performance - 10 times slower

Can someone suggest what I can tune to improve performance? The cluster writes at about 13 MB/s with a single copy of the data, while the underlying disks can each sustain 100+ MB/s. Diagnostics are below; any pointers would be appreciated.


rados bench results:

Concurrency   Replication size   Write (MB/s)   Seq Read (MB/s)
     32              1               13.5             32.8
     32              2               12.7             32.0
     32              3                6.1             30.2

Commands I used (the pool's replication size was changed appropriately between runs):

rados -p performance_test bench 120 write -b 8192 -t 100 --no-cleanup
rados -p performance_test bench 120 seq -t 100
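For comparison, since 8 KiB objects mostly exercise per-op latency rather than streaming throughput, I could also run the bench with rados bench's default 4 MB object size (just dropping -b), against the same pool:

rados -p performance_test bench 60 write -t 32 --no-cleanup
rados -p performance_test bench 60 seq -t 32
rados -p performance_test cleanup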



1) Disk tests - All have similar numbers:
# dd if=/dev/zero of=here bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 10.0691 s, 107 MB/s
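That dd is a single large sequential write; something closer to the journal's small synchronous writes would be 8 KiB direct+dsync writes, for example (the target path is just an example, pointing at one OSD's data filesystem):

# dd if=/dev/zero of=/var/lib/ceph/osd/ceph-42/deleteme bs=8k count=10000 oflag=direct,dsync
# rm /var/lib/ceph/osd/ceph-42/deleteme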

2) The 10G network does not appear to be the bottleneck:
# iperf -c 10.13.10.15 -i2 -t 10
------------------------------------------------------------
Client connecting to 10.13.10.15, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.13.30.13 port 56459 connected with 10.13.10.15 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec  2.17 GBytes  9.33 Gbits/sec
[  3]  2.0- 4.0 sec  2.18 GBytes  9.37 Gbits/sec
[  3]  4.0- 6.0 sec  2.18 GBytes  9.37 Gbits/sec
[  3]  6.0- 8.0 sec  2.18 GBytes  9.38 Gbits/sec
[  3]  8.0-10.0 sec  2.18 GBytes  9.37 Gbits/sec
[  3]  0.0-10.0 sec  10.9 GBytes  9.36 Gbits/sec
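The run above used the addresses shown; OSD replication traffic goes over the cluster network (10.2.0.0/16 per ceph.conf below), so it is probably worth repeating the test between two OSD hosts on their cluster-network addresses (IP below is a placeholder): run iperf -s on the target host, then on the client:

# iperf -c <cluster-network IP of target host> -i 2 -t 10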



3) Ceph Status

# ceph health
HEALTH_OK
root@us1-r04u05s01-ceph:~# ceph status
    cluster 5e95b6fa-0b99-4c31-8aa9-7a88h6hc5eda
     health HEALTH_OK
     monmap e4: 4 mons at {us1-r01u05s01-ceph=10.1.30.10:6789/0,us1-r01u09s01-ceph=10.1.30.11:6789/0,us1-r04u05s01-ceph=10.1.30.14:6789/0,us1-r04u09s01-ceph=10.1.30.15:6789/0}, election epoch 78, quorum 0,1,2,3 us1-r01u05s01-ceph,us1-r01u09s01-ceph,us1-r04u05s01-ceph,us1-r04u09s01-ceph
     osdmap e1029: 97 osds: 97 up, 97 in
      pgmap v1850869: 12480 pgs, 6 pools, 587 GB data, 116 kobjects
            1787 GB used, 318 TB / 320 TB avail
               12480 active+clean
  client io 0 B/s rd, 25460 B/s wr, 20 op/s
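To see whether a few slow OSDs are dragging everything down, per-OSD latencies can be sampled while a bench is running (this command should be available on this release; the columns are journal commit and apply latency in ms):

# ceph osd perf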


4) Ceph configuration
# cat ceph.conf 

[global]
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx
  cephx require signatures = True
  cephx cluster require signatures = True
  cephx service require signatures = False
  fsid = 5e95b6fa-0b99-4c31-8aa9-7a88h6hc5eda
  osd pool default pg num = 4096
  osd pool default pgp num = 4096
  osd pool default size = 3
  osd pool default min size = 1
  osd pool default crush rule = 0
  # Disable in-memory logs
  debug_lockdep = 0/0
  debug_context = 0/0
  debug_crush = 0/0
  debug_buffer = 0/0
  debug_timer = 0/0
  debug_filer = 0/0
  debug_objecter = 0/0
  debug_rados = 0/0
  debug_rbd = 0/0
  debug_journaler = 0/0
  debug_objectcacher = 0/0
  debug_client = 0/0
  debug_osd = 0/0
  debug_optracker = 0/0
  debug_objclass = 0/0
  debug_filestore = 0/0
  debug_journal = 0/0
  debug_ms = 0/0
  debug_monc = 0/0
  debug_tp = 0/0
  debug_auth = 0/0
  debug_finisher = 0/0
  debug_heartbeatmap = 0/0
  debug_perfcounter = 0/0
  debug_asok = 0/0
  debug_throttle = 0/0
  debug_mon = 0/0
  debug_paxos = 0/0
  debug_rgw = 0/0

[mon]
  mon osd down out interval = 600
  mon osd min down reporters = 2

[mon.us1-r01u05s01-ceph]
  host = us1-r01u05s01-ceph
  mon addr = 10.1.30.10

[mon.us1-r01u09s01-ceph]
  host = us1-r01u09s01-ceph
  mon addr = 10.1.30.11

[mon.us1-r04u05s01-ceph]
  host = us1-r04u05s01-ceph
  mon addr = 10.1.30.14

[mon.us1-r04u09s01-ceph]
  host = us1-r04u09s01-ceph
  mon addr = 10.1.30.15

[osd]
  osd mkfs type = xfs
  osd mkfs options xfs = -f -i size=2048
  osd mount options xfs = noatime
  osd journal size = 10000
  cluster_network = 10.2.0.0/16
  public_network = 10.1.0.0/16
  osd mon heartbeat interval = 30
  # Performance tuning
  filestore merge threshold = 40
  filestore split multiple = 8
  osd op threads = 8
  filestore op threads = 8
  filestore max sync interval = 5
  osd max scrubs = 1
  # Recovery tuning
  osd recovery max active = 5
  osd max backfills = 2
  osd recovery op priority = 2
  osd recovery max chunk = 8388608
  osd recovery threads = 1
  osd objectstore = filestore
  osd crush update on start = true

[mds]

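To double-check that the tuning above is what the running daemons actually use (and not just what is in the file), the admin socket on an OSD host can be queried; osd.0 is just an example:

# ceph daemon osd.0 config show | grep -E 'filestore|op_threads|journal'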


5) Ceph OSDs / CRUSH map

# ceph osd tree
# id weight type name up/down reweight
-14 5.46 root fusion_drives
-11 5.46 rack rack01-fusion
-7 2.73 host us1-r01u25s01-compf-fusion
89 2.73 osd.89 up 1
-9 2.73 host us1-r01u23s01-compf-fusion
96 2.73 osd.96 up 1
-13 315.2 root sata_drives
-10 166 rack rack01-sata
-2 76.44 host us1-r01u05s01-ceph
0 3.64 osd.0 up 1
1 3.64 osd.1 up 1
2 3.64 osd.2 up 1
3 3.64 osd.3 up 1
4 3.64 osd.4 up 1
5 3.64 osd.5 up 1
6 3.64 osd.6 up 1
7 3.64 osd.7 up 1
8 3.64 osd.8 up 1
9 3.64 osd.9 up 1
10 3.64 osd.10 up 1
11 3.64 osd.11 up 1
12 3.64 osd.12 up 1
13 3.64 osd.13 up 1
14 3.64 osd.14 up 1
15 3.64 osd.15 up 1
16 3.64 osd.16 up 1
17 3.64 osd.17 up 1
18 3.64 osd.18 up 1
19 3.64 osd.19 up 1
20 3.64 osd.20 up 1
-3 76.44 host us1-r01u09s01-ceph
21 3.64 osd.21 up 1
22 3.64 osd.22 up 1
23 3.64 osd.23 up 1
24 3.64 osd.24 up 1
25 3.64 osd.25 up 1
26 3.64 osd.26 up 1
27 3.64 osd.27 up 1
28 3.64 osd.28 up 1
29 3.64 osd.29 up 1
30 3.64 osd.30 up 1
31 3.64 osd.31 up 1
32 3.64 osd.32 up 1
33 3.64 osd.33 up 1
34 3.64 osd.34 up 1
35 3.64 osd.35 up 1
36 3.64 osd.36 up 1
37 3.64 osd.37 up 1
38 3.64 osd.38 up 1
39 3.64 osd.39 up 1
40 3.64 osd.40 up 1
41 3.64 osd.41 up 1
-6 6.54 host us1-r01u25s01-compf-sata
83 1.09 osd.83 up 1
84 1.09 osd.84 up 1
85 1.09 osd.85 up 1
86 1.09 osd.86 up 1
87 1.09 osd.87 up 1
88 1.09 osd.88 up 1
-8 6.54 host us1-r01u23s01-compf-sata
90 1.09 osd.90 up 1
91 1.09 osd.91 up 1
92 1.09 osd.92 up 1
93 1.09 osd.93 up 1
94 1.09 osd.94 up 1
95 1.09 osd.95 up 1
-12 149.2 rack rack04-sata
-4 72.8 host us1-r04u05s01-ceph
42 3.64 osd.42 up 1
43 3.64 osd.43 up 1
44 3.64 osd.44 up 1
45 3.64 osd.45 up 1
46 3.64 osd.46 up 1
47 3.64 osd.47 up 1
48 3.64 osd.48 up 1
49 3.64 osd.49 up 1
50 3.64 osd.50 up 1
51 3.64 osd.51 up 1
52 3.64 osd.52 up 1
53 3.64 osd.53 up 1
54 3.64 osd.54 up 1
55 3.64 osd.55 up 1
56 3.64 osd.56 up 1
57 3.64 osd.57 up 1
58 3.64 osd.58 up 1
59 3.64 osd.59 up 1
60 3.64 osd.60 up 1
61 3.64 osd.61 up 1
-5 76.44 host us1-r04u09s01-ceph
62 3.64 osd.62 up 1
63 3.64 osd.63 up 1
64 3.64 osd.64 up 1
65 3.64 osd.65 up 1
66 3.64 osd.66 up 1
67 3.64 osd.67 up 1
68 3.64 osd.68 up 1
69 3.64 osd.69 up 1
70 3.64 osd.70 up 1
71 3.64 osd.71 up 1
72 3.64 osd.72 up 1
73 3.64 osd.73 up 1
74 3.64 osd.74 up 1
75 3.64 osd.75 up 1
76 3.64 osd.76 up 1
77 3.64 osd.77 up 1
78 3.64 osd.78 up 1
79 3.64 osd.79 up 1
80 3.64 osd.80 up 1
81 3.64 osd.81 up 1
82 3.64 osd.82 up 1
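Since there are two CRUSH roots (fusion_drives and sata_drives), it is worth confirming which rule and root the benchmark pool actually maps to; on this release the pool property should be crush_ruleset:

# ceph osd pool get performance_test crush_ruleset
# ceph osd crush rule dump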


6) OSD mounts from one of the cluster nodes, rest are similar (df output, sizes in 1K blocks):
/dev/sda1                   3905109820 16741944 3888367876   1% /var/lib/ceph/osd/ceph-42
/dev/sdb1                   3905109820 19553976 3885555844   1% /var/lib/ceph/osd/ceph-43
/dev/sdc1                   3905109820 18081680 3887028140   1% /var/lib/ceph/osd/ceph-44
/dev/sdd1                   3905109820 19070596 3886039224   1% /var/lib/ceph/osd/ceph-45
/dev/sde1                   3905109820 17949284 3887160536   1% /var/lib/ceph/osd/ceph-46
/dev/sdf1                   3905109820 18538344 3886571476   1% /var/lib/ceph/osd/ceph-47
/dev/sdg1                   3905109820 17792608 3887317212   1% /var/lib/ceph/osd/ceph-48
/dev/sdh1                   3905109820 20910976 3884198844   1% /var/lib/ceph/osd/ceph-49
/dev/sdi1                   3905109820 19683208 3885426612   1% /var/lib/ceph/osd/ceph-50
/dev/sdj1                   3905109820 20115236 3884994584   1% /var/lib/ceph/osd/ceph-51
/dev/sdk1                   3905109820 19152812 3885957008   1% /var/lib/ceph/osd/ceph-52
/dev/sdm1                   3905109820 18701728 3886408092   1% /var/lib/ceph/osd/ceph-53
/dev/sdn1                   3905109820 19603536 3885506284   1% /var/lib/ceph/osd/ceph-54
/dev/sdo1                   3905109820 20164928 3884944892   1% /var/lib/ceph/osd/ceph-55
/dev/sdp1                   3905109820 19093024 3886016796   1% /var/lib/ceph/osd/ceph-56
/dev/sdq1                   3905109820 18699344 3886410476   1% /var/lib/ceph/osd/ceph-57
/dev/sdr1                   3905109820 19267068 3885842752   1% /var/lib/ceph/osd/ceph-58
/dev/sds1                   3905109820 19745212 3885364608   1% /var/lib/ceph/osd/ceph-59
/dev/sdt1                   3905109820 16321696 3888788124   1% /var/lib/ceph/osd/ceph-60
/dev/sdu1                   3905109820 19154884 3885954936   1% /var/lib/ceph/osd/ceph-61
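During a bench run it also helps to watch per-disk utilization on an OSD host to see whether the data disks or the journal SSDs saturate first, for example:

# iostat -xm 2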


7) Journal files (there are two SSDs):
# parted /dev/sdy print
Model: ATA SanDisk SD7UB2Q5 (scsi)
Disk /dev/sdy: 512GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name          Flags
 1      1049kB  10.5GB  10.5GB               ceph journal
 2      10.5GB  21.0GB  10.5GB               ceph journal
 3      21.0GB  31.5GB  10.5GB               ceph journal
 4      31.5GB  41.9GB  10.5GB               ceph journal
 5      41.9GB  52.4GB  10.5GB               ceph journal
 6      52.4GB  62.9GB  10.5GB               ceph journal
 7      62.9GB  73.4GB  10.5GB               ceph journal
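With roughly 20 OSDs per node and 7 journal partitions per SSD (14 across the two SSDs), some OSDs may be journaling on their own data disk; the journal symlink in each OSD's data directory shows where each journal actually lives:

# ls -l /var/lib/ceph/osd/ceph-*/journal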
