Re: rbd performance issue - can't find bottleneck

On 06/18/2015 04:49 AM, Jacek Jarosiewicz wrote:
On 06/17/2015 04:19 PM, Mark Nelson wrote:
SSDs are INTEL SSDSC2BW240A4

Ah, if I'm not mistaken that's the Intel 530, right?  You'll want to
see this thread by Stefan Priebe:

https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg05667.html

In fact, it was the performance difference between the Intel 520 and
the Intel 530 that triggered many of the investigations by various
folks into SSD flushing behavior on ATA_CMD_FLUSH.  The gist of it is
that the 520 is very fast but probably not safe, while the 530 is safe
but not fast.  The DC S3700 (and similar drives with supercapacitors)
is thought to be both fast and safe (though some drives, like the
Crucial M500 and later, misrepresented their power loss protection, so
you have to be very careful!)
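
If you want to check a drive's sync write behavior directly, a quick
way is a single-threaded O_DSYNC write test with fio -- roughly the
sketch below (untested here; point --filename at a scratch file on the
SSD, since writing to the raw device is destructive):

# queue depth 1, 4k sync writes -- roughly what the OSD journal does
fio --name=sync-write-test --filename=/path/to/ssd/testfile --size=1G \
    --rw=write --bs=4k --direct=1 --sync=1 --iodepth=1 --numjobs=1 \
    --runtime=60 --time_based --group_reporting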


Yes, these are Intel 530.
I did the tests described in the thread you pasted and unfortunately
that's my case... I think.

The dd run locally on a mounted SSD partition looks like this:

[root@cf02 journal]# dd if=/dev/zero of=test bs=350k count=10000 oflag=direct,dsync
10000+0 records in
10000+0 records out
3584000000 bytes (3.6 GB) copied, 211.698 s, 16.9 MB/s

and when I skip the dsync flag it goes fast:

[root@cf02 journal]# dd if=/dev/zero of=test bs=350k count=10000 oflag=direct
10000+0 records in
10000+0 records out
3584000000 bytes (3.6 GB) copied, 9.05432 s, 396 MB/s

(I used the same 350k block size as mentioned in the e-mail from the
thread above)

I tried disabling the flushes by marking the drive cache as write-through, like this:

[root@cf02 ~]# echo temporary write through > /sys/class/scsi_disk/1\:0\:0\:0/cache_type

[root@cf02 ~]# cat /sys/class/scsi_disk/1\:0\:0\:0/cache_type
write through
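
(Note: this sysfs setting is per device, and the "temporary" form does
not survive a reboot or a device rescan. To cover every SCSI disk on a
node you would need something along the lines of the loop below --
narrow the glob if only some of the disks are SSDs:)

# reapply the non-persistent write-through override to all SCSI disks
for f in /sys/class/scsi_disk/*/cache_type; do
    echo "temporary write through" > "$f"
done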

...and then locally I see the speedup:

[root@cf02 journal]# dd if=/dev/zero of=test bs=350k count=10000 oflag=direct,dsync
10000+0 records in
10000+0 records out
3584000000 bytes (3.6 GB) copied, 10.4624 s, 343 MB/s


...but when I test it from a client I still get slow results:

root@cf03:/ceph/tmp# dd if=/dev/zero of=test bs=100M count=100 oflag=direct
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 122.482 s, 85.6 MB/s

and fio gives the same 2-3k IOPS.
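
(The fio job file isn't included here; for anyone wanting to reproduce
it, something along these lines against a file on the mounted image
should exercise the same path -- treat the parameters and the file name
as a guess, not the exact job that was run:)

# 4k direct random writes to a file on the RBD-backed filesystem
fio --name=rbd-randwrite --filename=/ceph/tmp/fio.test --size=1G \
    --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --iodepth=32 \
    --runtime=60 --time_based --group_reporting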

After the change to the SSD cache_type I tried remounting the test
image, recreating it, and so on - nothing helped.

I ran rbd bench-write on it, and it's not good either:

root@cf03:~# rbd bench-write t2
bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern seq
   SEC       OPS   OPS/SEC   BYTES/SEC
     1      4221   4220.64  32195919.35
     2      9628   4813.95  36286083.00
     3     15288   4790.90  35714620.49
     4     19610   4902.47  36626193.93
     5     24844   4968.37  37296562.14
     6     30488   5081.31  38112444.88
     7     36152   5164.54  38601615.10
     8     41479   5184.80  38860207.38
     9     46971   5218.70  39181437.52
    10     52219   5221.77  39322641.34
    11     56666   5151.36  38761566.30
    12     62073   5172.71  38855021.35
    13     65962   5073.95  38182880.49
    14     71541   5110.02  38431536.17
    15     77039   5135.85  38615125.42
    16     82133   5133.31  38692578.98
    17     87657   5156.24  38849948.84
    18     92943   5141.03  38635464.85
    19     97528   5133.03  38628548.32
    20    103100   5154.99  38751359.30
    21    108952   5188.09  38944016.94
    22    114511   5205.01  38999594.18
    23    120319   5231.17  39138227.64
    24    125975   5248.92  39195739.46
    25    131438   5257.50  39259023.06
    26    136883   5264.72  39344673.41
    27    142362   5272.66  39381638.20
elapsed:    27  ops:   143789  ops/sec:  5273.01  bytes/sec: 39376124.30

rados bench gives:

root@cf03:~# rados -p rbd bench 30 write --no-cleanup
  Maintaining 16 concurrent writes of 4194304 bytes for up to 30 seconds or 0 objects
  Object prefix: benchmark_data_cf03_21194
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
      0       0         0         0         0         0         -         0
      1      16        28        12   47.9863        48  0.779211   0.48964
      2      16        43        27   53.9886        60   1.17958  0.775733
      3      16        59        43    57.322        64  0.157145  0.798348
      4      16        73        57   56.9897        56  0.424493  0.862553
      5      16        89        73     58.39        64  0.246444  0.893064
      6      16       104        88   58.6569        60   1.67389  0.901757
      7      16       120       104   59.4186        64   1.78324  0.935242
      8      16       132       116   57.9905        48   1.50035  0.963947
      9      16       147       131   58.2128        60   1.85047  0.978697
     10      16       161       145   57.9908        56  0.133187  0.999999
     11      16       174       158   57.4455        52   1.59548   1.02264
     12      16       189       173   57.6577        60  0.179966   1.01623
     13      16       206       190   58.4526        68   1.93064   1.02108
     14      16       221       205   58.5624        60   1.54504   1.02566
     15      16       236       220   58.6578        60   1.69023    1.0301
     16      16       251       235   58.7411        60    1.5683   1.02514
     17      16       263       247   58.1089        48   1.99782    1.0293
     18      16       278       262   58.2136        60   2.03487   1.03552
     19      16       295       279   58.7282        68  0.292065   1.03412
     20      16       310       294   58.7913        60   1.61331    1.0436
     21      16       323       307   58.4675        52  0.161555   1.04393
     22      16       335       319   57.9914        48   1.55905   1.05392
     23      16       351       335   58.2523        64  0.317811   1.04937
     24      16       369       353   58.8247        72   1.76145   1.05415
     25      16       383       367   58.7114        56   1.25224   1.05758
     26      16       399       383   58.9145        64   1.46604   1.05593
     27      16       414       398   58.9544        60  0.349479   1.04213
     28      16       431       415   59.2771        68   0.74857   1.04895
     29      16       448       432   59.5776        68   1.16596   1.04986
     30      16       464       448   59.7247        64  0.195269   1.04202
     31      16       465       449   57.9271         4   1.25089   1.04249
  Total time run:         31.407987
Total writes made:      465
Write size:             4194304
Bandwidth (MB/sec):     59.221

Stddev Bandwidth:       15.5579
Max bandwidth (MB/sec): 72
Min bandwidth (MB/sec): 0
Average Latency:        1.07412
Stddev Latency:         0.691676
Max latency:            2.52896
Min latency:            0.113751
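
(A quick sanity check on those write numbers: with 16 writes of 4 MB in
flight and an average latency of about 1.07 s, the expected throughput
is

    16 ops x 4 MB / 1.074 s ~= 59.6 MB/s

which matches the reported 59.221 MB/s -- in other words the cluster is
latency bound at roughly a second per 4 MB write, not bandwidth bound.)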

and reading:

root@cf03:/ceph/tmp# rados -p rbd bench 30 rand
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
      0       0         0         0         0         0         -         0
      1      16        43        27   107.964       108  0.650441  0.415883
      2      16        71        55   109.972       112  0.624493  0.485735
      3      16       100        84   111.975       116   0.77036  0.518524
      4      16       128       112   111.977       112  0.329123  0.522431
      5      16       155       139   111.179       108  0.702401  0.538305
      6      16       184       168   111.979       116    0.7502  0.543431
      7      16       213       197   112.551       116   0.46755  0.547047
      8      16       240       224   111.981       108  0.430872  0.548855
      9      16       268       252   111.981       112  0.740558  0.550753
     10      16       297       281   112.381       116  0.340352  0.551335
     11      16       325       309   112.345       112   1.14164  0.544646
     12      16       353       337   112.315       112   0.46038  0.555206
     13      16       382       366   112.597       116  0.727224  0.556029
     14      16       410       394   112.553       112  0.673523  0.557172
     15      16       438       422   112.516       112  0.543171  0.558385
     16      16       466       450   112.482       112  0.370119  0.557367
     17      16       494       478   112.453       112   0.89322  0.556681
     18      16       522       506   112.427       112  0.651126  0.559601
     19      16       551       535   112.614       116  0.801207   0.55739
     20      16       579       563   112.583       112   0.92365  0.558744
     21      16       607       591   112.554       112  0.679443   0.55983
     22      16       635       619   112.528       112  0.273806  0.557695
     23      16       664       648   112.679       116   0.33258  0.559718
     24      15       691       676    112.65       112  0.141288  0.559192
     25      16       720       704   112.623       112  0.901803  0.559435
     26      16       748       732   112.598       112  0.807202  0.559793
     27      16       776       760   112.576       112  0.747424  0.561044
     28      16       805       789   112.698       116  0.817418  0.560835
     29      16       833       817   112.673       112  0.711397  0.562342
     30      16       861       845    112.65       112  0.520696  0.562809
  Total time run:        30.547818
Total reads made:     861
Read size:            4194304
Bandwidth (MB/sec):    112.741

Average Latency:       0.566574
Max latency:           1.2147
Min latency:           0.06128


So... in order to increase performance, do I need to change the SSD drives?

I'm just guessing, but because your read performance is slow as well, you may have multiple issues going on. The Intel 530 being slow at O_DSYNC writes is one of them, but it's possible there is something else too. If I were in your position I think I'd try to beg/borrow/steal a single DC S3700 or even a 520 (despite its presumed lack of safety) and just see how a single-OSD cluster using it does on your setup before replacing everything.
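
Once such a throwaway single-OSD cluster is up, a rough sketch of what
I'd run against it (this assumes the default rbd pool exists; with only
one OSD the pool needs its replication dropped to 1):

# drop replication so writes don't wait for replicas that can't exist
ceph osd pool set rbd size 1
ceph osd pool set rbd min_size 1
# then repeat the same benchmarks as above
rados -p rbd bench 30 write --no-cleanup
rbd create t2 --size 10240
rbd bench-write t2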


J

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


