Hi,

For the read benchmark with fio, what iodepth are you using?

My fio 4k randread results:

iodepth=1  : bw=6795.1KB/s,  iops=1698
iodepth=2  : bw=14608KB/s,   iops=3652
iodepth=4  : bw=32686KB/s,   iops=8171
iodepth=8  : bw=76175KB/s,   iops=19043
iodepth=16 : bw=173651KB/s,  iops=43412
iodepth=32 : bw=336719KB/s,  iops=84179

(This should scale similarly with the rados bench -t (threads) option.)

This is expected because of network latency + ceph latency: more parallelism gives more iops. (A benchmark with "dd" is effectively iodepth=1.)

These results are with 1 client / 1 rbd volume. With more fio clients (numjobs=X) I can reach up to 300k iops with 8-10 clients. It should be the same when launching multiple rados bench instances in parallel (BTW, it would be great to have an option in rados bench to do that).

----- Original Message -----
From: "Jacek Jarosiewicz" <jjarosiewicz@xxxxxxxxxxxxx>
To: "Mark Nelson" <mnelson@xxxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Thursday, June 18, 2015 11:49:11
Subject: Re: rbd performance issue - can't find bottleneck

On 06/17/2015 04:19 PM, Mark Nelson wrote:
>> SSD's are INTEL SSDSC2BW240A4
>
> Ah, if I'm not mistaken that's the Intel 530, right? You'll want to see
> this thread by Stefan Priebe:
>
> https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg05667.html
>
> In fact it was the difference in Intel 520 and Intel 530 performance
> that triggered many of the investigations by various folks into SSD
> flushing behavior on ATA_CMD_FLUSH. The gist of it is that the 520 is
> very fast but probably not safe. The 530 is safe but not fast. The DC
> S3700 (and similar drives with supercapacitors) are thought to be both
> fast and safe (though some drives like the Crucial M500 and later
> misrepresented their power loss protection, so you have to be very
> careful!)

Yes, these are Intel 530. I did the tests described in the thread you
pasted, and unfortunately that's my case... I think.

The dd run locally on a mounted ssd partition looks like this:

[root@cf02 journal]# dd if=/dev/zero of=test bs=350k count=10000 oflag=direct,dsync
10000+0 records in
10000+0 records out
3584000000 bytes (3.6 GB) copied, 211.698 s, 16.9 MB/s

and when I skip the dsync flag it goes fast:

[root@cf02 journal]# dd if=/dev/zero of=test bs=350k count=10000 oflag=direct
10000+0 records in
10000+0 records out
3584000000 bytes (3.6 GB) copied, 9.05432 s, 396 MB/s

(I used the same 350k block size as mentioned in the e-mail from the thread above.)

I tried disabling the dsync like this:

[root@cf02 ~]# echo temporary write through > /sys/class/scsi_disk/1\:0\:0\:0/cache_type
[root@cf02 ~]# cat /sys/class/scsi_disk/1\:0\:0\:0/cache_type
write through

..and then locally I see the speedup:

[root@cf02 journal]# dd if=/dev/zero of=test bs=350k count=10000 oflag=direct,dsync
10000+0 records in
10000+0 records out
3584000000 bytes (3.6 GB) copied, 10.4624 s, 343 MB/s

..but when I test it from a client I still get slow results:

root@cf03:/ceph/tmp# dd if=/dev/zero of=test bs=100M count=100 oflag=direct
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 122.482 s, 85.6 MB/s

and fio gives the same 2-3k iops.

After the change to the SSD cache_type I tried remounting the test image,
recreating it, and so on - nothing helped.
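As an aside, anyone who wants to repeat the O_DSYNC check on another drive can use a minimal shell sketch like the one below; the mount point /mnt/ssdtest, the reduced count, and the extra block sizes are placeholders - only the 350k case and the dd flags come from the thread above:

cd /mnt/ssdtest
for bs in 4k 350k 1M; do
    # O_DIRECT + O_DSYNC writes, roughly the access pattern the ceph journal puts on the drive
    echo -n "bs=$bs: "
    dd if=/dev/zero of=ddtest bs=$bs count=1000 oflag=direct,dsync 2>&1 | tail -n 1
done
rm -f ddtest

A drive with real power-loss protection should stay close to its plain oflag=direct numbers; the Intel 530 figures above collapse from ~396 MB/s to ~17 MB/s as soon as dsync is added.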
I ran rbd bench-write on it, and it's not good either:

root@cf03:~# rbd bench-write t2
bench-write  io_size 4096  io_threads 16  bytes 1073741824  pattern seq
  SEC       OPS   OPS/SEC     BYTES/SEC
    1      4221   4220.64   32195919.35
    2      9628   4813.95   36286083.00
    3     15288   4790.90   35714620.49
    4     19610   4902.47   36626193.93
    5     24844   4968.37   37296562.14
    6     30488   5081.31   38112444.88
    7     36152   5164.54   38601615.10
    8     41479   5184.80   38860207.38
    9     46971   5218.70   39181437.52
   10     52219   5221.77   39322641.34
   11     56666   5151.36   38761566.30
   12     62073   5172.71   38855021.35
   13     65962   5073.95   38182880.49
   14     71541   5110.02   38431536.17
   15     77039   5135.85   38615125.42
   16     82133   5133.31   38692578.98
   17     87657   5156.24   38849948.84
   18     92943   5141.03   38635464.85
   19     97528   5133.03   38628548.32
   20    103100   5154.99   38751359.30
   21    108952   5188.09   38944016.94
   22    114511   5205.01   38999594.18
   23    120319   5231.17   39138227.64
   24    125975   5248.92   39195739.46
   25    131438   5257.50   39259023.06
   26    136883   5264.72   39344673.41
   27    142362   5272.66   39381638.20
elapsed:  27  ops: 143789  ops/sec: 5273.01  bytes/sec: 39376124.30

rados bench gives:

root@cf03:~# rados -p rbd bench 30 write --no-cleanup
 Maintaining 16 concurrent writes of 4194304 bytes for up to 30 seconds or 0 objects
 Object prefix: benchmark_data_cf03_21194
   sec  Cur ops  started  finished  avg MB/s  cur MB/s  last lat   avg lat
     0        0        0         0         0         0         -         0
     1       16       28        12   47.9863        48  0.779211   0.48964
     2       16       43        27   53.9886        60   1.17958  0.775733
     3       16       59        43    57.322        64  0.157145  0.798348
     4       16       73        57   56.9897        56  0.424493  0.862553
     5       16       89        73     58.39        64  0.246444  0.893064
     6       16      104        88   58.6569        60   1.67389  0.901757
     7       16      120       104   59.4186        64   1.78324  0.935242
     8       16      132       116   57.9905        48   1.50035  0.963947
     9       16      147       131   58.2128        60   1.85047  0.978697
    10       16      161       145   57.9908        56  0.133187  0.999999
    11       16      174       158   57.4455        52   1.59548   1.02264
    12       16      189       173   57.6577        60  0.179966   1.01623
    13       16      206       190   58.4526        68   1.93064   1.02108
    14       16      221       205   58.5624        60   1.54504   1.02566
    15       16      236       220   58.6578        60   1.69023    1.0301
    16       16      251       235   58.7411        60    1.5683   1.02514
    17       16      263       247   58.1089        48   1.99782    1.0293
    18       16      278       262   58.2136        60   2.03487   1.03552
    19       16      295       279   58.7282        68  0.292065   1.03412
    20       16      310       294   58.7913        60   1.61331    1.0436
    21       16      323       307   58.4675        52  0.161555   1.04393
    22       16      335       319   57.9914        48   1.55905   1.05392
    23       16      351       335   58.2523        64  0.317811   1.04937
    24       16      369       353   58.8247        72   1.76145   1.05415
    25       16      383       367   58.7114        56   1.25224   1.05758
    26       16      399       383   58.9145        64   1.46604   1.05593
    27       16      414       398   58.9544        60  0.349479   1.04213
    28       16      431       415   59.2771        68   0.74857   1.04895
    29       16      448       432   59.5776        68   1.16596   1.04986
    30       16      464       448   59.7247        64  0.195269   1.04202
    31       16      465       449   57.9271         4   1.25089   1.04249
 Total time run:         31.407987
Total writes made:      465
Write size:             4194304
Bandwidth (MB/sec):     59.221

Stddev Bandwidth:       15.5579
Max bandwidth (MB/sec): 72
Min bandwidth (MB/sec): 0
Average Latency:        1.07412
Stddev Latency:         0.691676
Max latency:            2.52896
Min latency:            0.113751
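As an aside, the suggestion at the top of this thread - launching multiple rados bench instances in parallel to add client-side parallelism - can be approximated with a plain shell loop. A rough sketch (the instance count and the -t value are arbitrary; the pool and flags are the ones used above):

for i in 1 2 3 4; do
    # each instance writes under its own benchmark_data_<host>_<pid> object prefix
    rados -p rbd bench 30 write -t 16 --no-cleanup &
done
wait

Since the object prefix includes the hostname and pid (as in the output above), the instances should not collide, and summing their reported bandwidth gives a rough aggregate figure.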
and reading:

root@cf03:/ceph/tmp# rados -p rbd bench 30 rand
   sec  Cur ops  started  finished  avg MB/s  cur MB/s  last lat   avg lat
     0        0        0         0         0         0         -         0
     1       16       43        27   107.964       108  0.650441  0.415883
     2       16       71        55   109.972       112  0.624493  0.485735
     3       16      100        84   111.975       116   0.77036  0.518524
     4       16      128       112   111.977       112  0.329123  0.522431
     5       16      155       139   111.179       108  0.702401  0.538305
     6       16      184       168   111.979       116    0.7502  0.543431
     7       16      213       197   112.551       116   0.46755  0.547047
     8       16      240       224   111.981       108  0.430872  0.548855
     9       16      268       252   111.981       112  0.740558  0.550753
    10       16      297       281   112.381       116  0.340352  0.551335
    11       16      325       309   112.345       112   1.14164  0.544646
    12       16      353       337   112.315       112   0.46038  0.555206
    13       16      382       366   112.597       116  0.727224  0.556029
    14       16      410       394   112.553       112  0.673523  0.557172
    15       16      438       422   112.516       112  0.543171  0.558385
    16       16      466       450   112.482       112  0.370119  0.557367
    17       16      494       478   112.453       112   0.89322  0.556681
    18       16      522       506   112.427       112  0.651126  0.559601
    19       16      551       535   112.614       116  0.801207   0.55739
    20       16      579       563   112.583       112   0.92365  0.558744
    21       16      607       591   112.554       112  0.679443   0.55983
    22       16      635       619   112.528       112  0.273806  0.557695
    23       16      664       648   112.679       116   0.33258  0.559718
    24       15      691       676    112.65       112  0.141288  0.559192
    25       16      720       704   112.623       112  0.901803  0.559435
    26       16      748       732   112.598       112  0.807202  0.559793
    27       16      776       760   112.576       112  0.747424  0.561044
    28       16      805       789   112.698       116  0.817418  0.560835
    29       16      833       817   112.673       112  0.711397  0.562342
    30       16      861       845    112.65       112  0.520696  0.562809
 Total time run:        30.547818
Total reads made:       861
Read size:              4194304
Bandwidth (MB/sec):     112.741

Average Latency:        0.566574
Max latency:            1.2147
Min latency:            0.06128

So... in order to increase performance, do I need to change the SSD drives?

J

-- 
Jacek Jarosiewicz
IT Systems Administrator
----------------------------------------------------------------------------------------
SUPERMEDIA Sp. z o.o., registered office in Warsaw
ul. Senatorska 13/15, 00-075 Warszawa
District Court for the Capital City of Warsaw, 12th Commercial Division of the
National Court Register, KRS no. 0000029537; share capital PLN 42,756,000;
NIP: 957-05-49-503
Mailing address: ul. Jubilerska 10, 04-190 Warszawa
----------------------------------------------------------------------------------------
SUPERMEDIA -> http://www.supermedia.pl
internet access - hosting - colocation - links - telephony

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com