Read Stalls with Multiple OSD Servers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

 

I’m running a three server cluster (one monitor, two OSD) and am having a problem where after adding the second OSD server, my read rate drops significantly and eventually the reads stall (writes are improved as expected). Attached is a log of the rados benchmarks for the two configurations and below is my hardware configuration. I’m not using replicas (capacity is more important than uptime for our use case) and am using a single 10GbE network. The pool (rbd) is configured with 128 placement groups.

 

I’ve checked the CPU utilization of the ceph-osd processes and they all hover around 10% until the stall. After the stall, the CPU usage is 0% and the disks all show zero operations via iostat. Iperf reports 9.9Gb/s between the monitor and OSD servers.

 

I’m looking for any advice/help on how to identify the source of this issue as my attempts so far have proven fruitless…

 

Monitor server:

2x E5-2680V3

32GB DDR4

2x 4TB HDD in RAID1 on an Avago/LSI 3108 with Cachevault, configured as write-back

10GbE

 

OSD servers:

2x E5-2680V3

128GB DDR4

2x 8+2 RAID6 using 8TB SAS12 drives on an Avago/LSI 9380 controller with Cachevault, configured as write-back.

                - Each RAID6 is an OSD

10GbE

 

Thanks,

 

Tom Helander

 

KLA-Tencor
One Technology Dr | M/S 5-2042R | Milpitas, CA | 95035

 

CONFIDENTIALITY NOTICE: This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail at thomas.helander@xxxxxxxxxxxxxx or by telephone at (408) 875-7819, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you.

 

[root@control01 system_initialize]# cat /etc/ceph/ceph.conf 
[global]
  fsid = 9501b36d-0ac0-44ec-b740-9906d092bac2
  mon initial members = control01
  mon host = 192.168.2.11:6789
  auth client required = cephx
  auth cluster required = cephx
  auth service required = cephx
  cluster network = 192.168.2.0/23
  osd crush chooseleaf type = 1
  osd pool default min size = 1
  osd pool default pg num = 128
  osd pool default size = 1
  public network = 192.168.2.0/23

[root@control01 system_initialize]# ceph -s
    cluster 9501b36d-0ac0-44ec-b740-9906d092bac2
     health HEALTH_OK
     monmap e1: 1 mons at {control01=192.168.2.11:6789/0}
            election epoch 2, quorum 0 control01
     osdmap e40: 2 osds: 2 up, 2 in
            flags sortbitwise
      pgmap v1826: 128 pgs, 1 pools, 0 bytes data, 0 objects
            102 MB used, 116 TB / 116 TB avail
                 128 active+clean
				 
[root@control01 system_initialize]# ceph osd tree
ID WEIGHT    TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 116.41660 root default                                         
-2 116.41660     host storage01                                   
 0  58.20830         osd.0           up  1.00000          1.00000 
 1  58.20830         osd.1           up  1.00000          1.00000
				 
[root@control01 system_initialize]# rados -p rbd bench 60 write --no-cleanup
Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds
Object prefix: benchmark_data_control01.adc_16939
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1      16       180       164   655.951       656 0.0712128 0.0943403
    2      16       323       307   613.933       572 0.0132957 0.0882554
    3      16       421       405   539.946       392 0.0121464  0.105845
    4      16       503       487   486.951       3280.00786829  0.117621
    5      16       650       634   507.149       588 0.0344602  0.125399
    6      16       730       714   475.952       320 0.0465193  0.133839
    7      16       804       788   450.241       296 0.0704427  0.133889
    8      16       952       936   467.954       592  0.103373  0.135459
    9      16      1078      1062   471.956       504  0.172915  0.134309
   10      16      1166      1150   459.956       352 0.0232164  0.136015
   11      16      1282      1266    460.32       464  0.226732  0.137086
   12      16      1378      1362   453.956       384  0.273416  0.140061
   13      16      1504      1488   457.802       504 0.0169217  0.138367
   14      16      1597      1581   451.671       372 0.0261604  0.140748
   15      16      1693      1677   447.157       384   0.30309  0.142337
   16      16      1774      1758   439.458       324 0.0165604  0.142555
   17      16      1913      1897    446.31       556 0.0215074  0.141801
   18      16      2001      1985   441.068       352 0.0326713  0.143268
   19      16      2117      2101   442.272       464 0.0631854  0.144277
2016-08-01 08:46:56.636292 min lat: 0.00786046 max lat: 0.796235 avg lat: 0.144196
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   20      16      2213      2197   439.357       384  0.062006  0.144196
   21      16      2338      2322   442.243       500 0.0504776  0.144552
   22      16      2431      2415   439.049       372  0.145096  0.144634
   23      16      2503      2487    432.48       288 0.0134286  0.147554
   24      16      2626      2610   434.958       492 0.0166429  0.146283
   25      16      2736      2720   435.158       440  0.158851  0.146813
   26      16      2843      2827   434.881       428  0.178634  0.146422
   27      16      2933      2917   432.106       360 0.0309586  0.147309
   28      16      3034      3018   431.101       404 0.0252305  0.147202
   29      16      3163      3147   434.027       516 0.0161699  0.146168
   30      16      3267      3251   433.424       416 0.0226611  0.147184
   31      16      3319      3303   426.152       208 0.0152707  0.149782
   32      16      3425      3409   426.083       424  0.650789  0.148421
   33      16      3583      3567   432.322       632   0.11366  0.147673
   34      16      3679      3663   430.899       384 0.0202234  0.147264
   35      16      3796      3780   431.958       468 0.0145311  0.147476
   36      16      3889      3873   430.291       372 0.0385374  0.148281
   37      16      4018      4002   432.607       516  0.333854  0.147036
   38      16      4130      4114   433.011       448 0.0200742  0.147417
   39      16      4240      4224   433.189       440  0.182814  0.147335
2016-08-01 08:47:16.638234 min lat: 0.00767328 max lat: 1.06087 avg lat: 0.147074
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   40      16      4350      4334   433.358       440 0.0622486  0.147074
   41      16      4444      4428   431.958       376  0.320389   0.14782
   42      16      4575      4559   434.148       524 0.0394409   0.14701
   43      16      4665      4649   432.423       360  0.330111  0.147809
   44      16      4760      4744   431.231       380 0.0339363  0.147463
   45      16      4888      4872   433.024       512 0.0162028  0.147562
   46      16      4975      4959   431.176       348 0.0587093  0.148274
   47      16      5054      5038   428.725       316  0.142593  0.148945
   48      16      5172      5156   429.625       472 0.0571224  0.148851
   49      16      5246      5230   426.897       296 0.0103518  0.149479
   50      16      5370      5354   428.278       496 0.0387389  0.148274
   51      16      5485      5469   428.899       460 0.0471625  0.148994
   52      16      5571      5555   427.266       344  0.286425  0.149251
   53      16      5708      5692   429.543       548 0.0409842  0.148532
   54      16      5747      5731   424.477       156   1.02458  0.149638
   55      16      5896      5880   427.594       596  0.054289   0.14954
   56      16      6003      5987     427.6       428  0.378443  0.149219
   57      16      6130      6114    429.01       508  0.134211  0.149041
   58      16      6203      6187   426.647       2920.00769267  0.149129
   59      16      6327      6311   427.822       496  0.136888  0.149386
2016-08-01 08:47:36.640301 min lat: 0.00766789 max lat: 1.07989 avg lat: 0.14986
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   60      16      6410      6394   426.224       332 0.0289986   0.14986
Total time run:         60.224132
Total writes made:      6411
Write size:             4194304
Bandwidth (MB/sec):     426
Stddev Bandwidth:       102
Max bandwidth (MB/sec): 656
Min bandwidth (MB/sec): 156
Average IOPS:           106
Stddev IOPS:            25
Max IOPS:               164
Min IOPS:               39
Average Latency:        0.15
Stddev Latency:         0.148
Max latency:            1.08
Min latency:            0.00767

[root@control01 system_initialize]# ssh storage01 'echo 3 > /proc/sys/vm/drop_caches'

[root@control01 system_initialize]# rados -p rbd bench 60 seq  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       1         1         0         0         0         -         0
    1      16       145       129   515.906       516 0.0167339   0.11488
    2      16       287       271   541.922       568 0.0522996  0.114998
    3      16       430       414   551.926       572 0.0524102  0.112196
    4      16       582       566    565.93       608 0.0769078  0.110808
    5      16       725       709   567.131       572 0.0392817  0.109597
    6      16       879       863   575.265       616  0.139403  0.109623
    7      16      1030      1014   579.361       604 0.0608292  0.108439
    8      16      1176      1160   579.932       584 0.0363153  0.108604
    9      16      1325      1309    581.71       596 0.0522732  0.108434
   10      16      1461      1445   577.932       544  0.192011  0.109137
   11      16      1618      1602   582.478       628 0.0856128  0.108291
   12      16      1770      1754     584.6       608 0.0983951   0.10812
   13      15      1925      1910   587.612       624  0.249809  0.107948
   14      16      2074      2058   587.921       592 0.0328021  0.107558
   15      15      2232      2217   591.099       636 0.0865074  0.107138
   16      16      2391      2375   593.651       632 0.0583012  0.106681
   17      16      2541      2525   594.021       600 0.0598446   0.10664
   18      16      2689      2673   593.905       592  0.150494  0.106773
   19      16      2829      2813   592.119       560  0.135322  0.107046
2016-08-01 08:49:13.364855 min lat: 0.00725 max lat: 0.474863 avg lat: 0.10682
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   20      16      2986      2970    593.91       628  0.140958   0.10682
   21      16      3129      3113   592.864       572 0.0663896  0.106913
   22      16      3292      3276    595.55       652 0.0777842  0.106593
   23      16      3446      3430   596.436       616 0.0902701  0.106483
   24      16      3582      3566   594.249       544 0.0267909  0.106545
   25      16      3725      3709   593.358       572  0.202035  0.106958
   26      16      3878      3862   594.073       612  0.075774  0.106808
   27      16      4029      4013   594.439       604  0.040735  0.106731
   28      16      4194      4178   596.778       660 0.0535983  0.106471
   29      16      4346      4330   597.164       608 0.0733419  0.106291
   30      16      4488      4472    596.19       568 0.0503257  0.106554
   31      16      4636      4620   596.053       592 0.0621321  0.106497
   32      16      4793      4777    597.05       628  0.148522  0.106465
   33      15      4944      4929   597.376       608 0.0346483  0.106377
   34      16      5099      5083   597.922       616  0.217152  0.106319
   35      16      5251      5235   598.208       608  0.038321  0.106254
   36      15      5395      5380   597.695       580 0.0631969  0.106301
   37      16      5546      5530   597.756       600  0.307726  0.106317
   38      16      5691      5675   597.288       580 0.0822542  0.106466
   39      16      5848      5832   598.074       628 0.0899488  0.106331
2016-08-01 08:49:33.367138 min lat: 0.00675095 max lat: 0.474863 avg lat: 0.106496
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   40      16      5989      5973   597.221       564  0.139705  0.106496
   41      16      6141      6125   597.482       608  0.116482   0.10645
   42      16      6295      6279   597.922       616  0.158778  0.106378
Total time run:       42.887748
Total reads made:     6411
Read size:            4194304
Bandwidth (MB/sec):   598
Average IOPS          149
Stddev IOPS:          7
Max IOPS:             165
Min IOPS:             129
Average Latency:      0.106
Max latency:          0.475
Min latency:          0.00675

[root@control01 system_initialize]# rados -p rbd cleanup

### ADDED SECOND SERVER HERE

[root@control01 system_initialize]# ceph -s
    cluster 9501b36d-0ac0-44ec-b740-9906d092bac2
     health HEALTH_OK
     monmap e1: 1 mons at {control01=192.168.2.11:6789/0}
            election epoch 2, quorum 0 control01
     osdmap e48: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v1881: 128 pgs, 1 pools, 0 bytes data, 0 objects
            178 MB used, 232 TB / 232 TB avail
                 128 active+clean
				 
[root@control01 system_initialize]# ceph osd tree
ID WEIGHT    TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 232.83319 root default                                         
-2 116.41660     host storage01                                   
 0  58.20830         osd.0           up  1.00000          1.00000 
 1  58.20830         osd.1           up  1.00000          1.00000 
-3 116.41660     host storage02                                   
 2  58.20830         osd.2           up  1.00000          1.00000 
 3  58.20830         osd.3           up  1.00000          1.00000
 
[root@control01 system_initialize]# rados -p rbd bench 60 write --no-cleanup
Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds
Object prefix: benchmark_data_control01.adc_19527
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1      16       295       279   1115.89      1116 0.0914904 0.0553005
    2      16       571       555   1109.87      1104 0.0793885 0.0558428
    3      16       785       769   1025.21       856 0.0806213 0.0607349
    4      15       940       925   924.889       624 0.0963824 0.0686032
    5      16      1142      1126   900.693       804 0.0120744 0.0689868
    6      16      1260      1244   829.236       472  0.559114 0.0733209
    7      16      1518      1502   858.186      1032  0.060122 0.0739932
    8      16      1685      1669   834.404       668 0.0257534 0.0762295
    9      16      1907      1891   840.349       888 0.0351589  0.076069
   10      16      2107      2091   836.305       800 0.0154241 0.0760561
   11      16      2245      2229   810.451       5520.00834552 0.0780325
   12      16      2462      2446   815.237       868  0.103543 0.0783068
   13      16      2707      2691   827.903       980 0.0453284 0.0771251
   14      16      2927      2911   831.618       880 0.0565513 0.0767363
   15      16      3091      3075   819.907       656 0.0160208 0.0779284
   16      16      3310      3294   823.406       876  0.154445  0.077495
   17      16      3498      3482     819.2       752 0.0576891 0.0770361
   18      16      3763      3747   832.571      1060  0.062403 0.0767189
   19      16      3920      3904   821.802       628 0.0191069 0.0771242
2016-08-01 08:57:40.542756 min lat: 0.00750826 max lat: 0.73383 avg lat: 0.0784191
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   20      16      4092      4076   815.109       688 0.0379247 0.0784191
   21      16      4343      4327   824.098      1004 0.0087996 0.0773914
   22      16      4512      4496   817.363       676 0.0221131 0.0780507
   23      16      4756      4740   824.256       976  0.035069 0.0775215
   24      16      4920      4904   817.242       656 0.0079845 0.0774722
   25      16      5107      5091    814.47       748 0.0561364 0.0784492
   26      16      5278      5262   809.449       684 0.0123023 0.0787802
   27      16      5449      5433     804.8       684  0.054684 0.0793674
   28      16      5592      5576   796.484       5720.00835733 0.0791427
   29      16      5695      5679   783.225       412 0.0182057 0.0816075
   30      16      5953      5937   791.513      1032  0.188799 0.0806221
   31      16      6173      6157   794.364       880 0.0728068 0.0797684
   32      16      6413      6397   799.537       960   0.02882 0.0797985
   33      16      6627      6611   801.245       856 0.0589031   0.07979
   34      16      6835      6819   802.147       832 0.0112783 0.0794941
   35      16      7003      6987   798.427       672 0.0374195 0.0800639
   36      16      7198      7182   797.912       780 0.0740644 0.0801101
   37      16      7414      7398   799.696       864 0.0614531 0.0799438
   38      16      7638      7622   802.227       8960.00811641 0.0792574
   39      15      7788      7773   797.143       604 0.0887847 0.0802261
2016-08-01 08:58:00.544911 min lat: 0.00748072 max lat: 1.24578 avg lat: 0.0794329
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   40      16      8036      8020   801.912       988  0.033339 0.0794329
   41      16      8183      8167   796.693       588 0.0193636 0.0802842
   42      16      8427      8411    800.96       976 0.0465702 0.0797765
   43      16      8611      8595   799.447       736 0.0105132 0.0798697
   44      16      8813      8797   799.639       808 0.0372847 0.0799672
   45      16      9009      8993   799.289       784   0.30762  0.079848
   46      16      9172      9156   796.086       652 0.0613054 0.0803336
   47      16      9383      9367   797.103       844 0.0201026 0.0801689
   48      16      9575      9559   796.495       768 0.0155817 0.0802785
   49      16      9780      9764   796.973       820  0.013651  0.080104
   50      16      9909      9893   791.352       516 0.0864205 0.0807965
   51      16     10181     10165   797.167      1088 0.0426923 0.0802229
   52      16     10378     10362   796.989       788 0.0719918 0.0802605
   53      16     10626     10610   800.666       992 0.0485357 0.0798906
   54      16     10767     10751   796.282       564 0.0108626 0.0802162
   55      16     11004     10988   799.039       9480.00754213 0.0798239
   56      16     11174     11158   796.912       680 0.0712432 0.0799379
   57      16     11400     11384    798.79       904 0.0263897 0.0799824
   58      16     11599     11583    798.74       796 0.0557034 0.0799678
   59      16     11814     11798   799.777       860 0.0232177 0.0797825
2016-08-01 08:58:20.547054 min lat: 0.00739155 max lat: 1.24578 avg lat: 0.0798717
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   60      15     12020     12005   800.246       828  0.207976 0.0798717
Total time run:         60.052154
Total writes made:      12020
Write size:             4194304
Bandwidth (MB/sec):     801
Stddev Bandwidth:       165
Max bandwidth (MB/sec): 1.12e+03
Min bandwidth (MB/sec): 412
Average IOPS:           200
Stddev IOPS:            41
Max IOPS:               279
Min IOPS:               103
Average Latency:        0.0799
Stddev Latency:         0.104
Max latency:            1.25
Min latency:            0.00739

[root@control01 system_initialize]# ssh storage01 'echo 3 > /proc/sys/vm/drop_caches'
[root@control01 system_initialize]# ssh storage02 'echo 3 > /proc/sys/vm/drop_caches'

[root@control01 system_initialize]# rados -p rbd bench 60 seq  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1      16       121       105   419.921       420 0.0231405  0.110763
    2      16       153       137   273.957       128 0.0425251  0.220774
    3      16       193       177   235.967       160  0.994681  0.215014
    4      15       258       243   242.954       264  0.789814  0.246963
    5      15       314       299   239.137       224 0.0538332  0.254025
    6      16       339       323   215.282        96 0.0373792   0.26461
    7      16       369       353    201.67       120 0.0605347  0.289337
    8      16       404       388    193.96       140 0.0139568  0.313728
    9      16       434       418   185.742       120  0.214502  0.318417
   10      16       467       451   180.366       132  0.673079  0.335801
   11      16       493       477   173.423       104  0.241314  0.338439
   12      16       499       483   160.972        24 0.0253943  0.335559
   13      16       499       483    148.59         0         -  0.335559
   14      16       499       483   137.977         0         -  0.335559
   15      16       499       483   128.779         0         -  0.335559
   16      16       499       483   120.731         0         -  0.335559
   17      16       499       483   113.629         0         -  0.335559
   18      16       499       483   107.317         0         -  0.335559
   19      16       499       483   101.669         0         -  0.335559
2016-08-01 09:01:15.596789 min lat: 0.00710772 max lat: 1.71721 avg lat: 0.335559
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   20      16       499       483   96.5857         0         -  0.335559
   21      16       499       483   91.9864         0         -  0.335559
   22      16       499       483   87.8053         0         -  0.335559
   23      16       499       483   83.9879         0         -  0.335559
   24      16       499       483   80.4884         0         -  0.335559
   25      16       499       483    77.269         0         -  0.335559
   26      16       499       483   74.2975         0         -  0.335559
   27      16       499       483   71.5458         0         -  0.335559
   28      16       499       483   68.9906         0         -  0.335559
   29      16       499       483   66.6117         0         -  0.335559
   30      16       499       483   64.3914         0         -  0.335559
   31      16       499       483   62.3143         0         -  0.335559
   32      16       499       483    60.367         0         -  0.335559
   33      16       499       483   58.5378         0         -  0.335559
   34      16       499       483   56.8162         0         -  0.335559
   35      16       499       483   55.1928         0         -  0.335559
   36      16       499       483   53.6597         0         -  0.335559
   37      16       499       483   52.2095         0         -  0.335559
   38      16       499       483   50.8356         0         -  0.335559
   39      16       499       483   49.5322         0         -  0.335559
2016-08-01 09:01:35.598904 min lat: 0.00710772 max lat: 1.71721 avg lat: 0.335559
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   40      16       499       483   48.2939         0         -  0.335559
   41      16       499       483    47.116         0         -  0.335559
   42      16       499       483   45.9942         0         -  0.335559
   43      16       499       483   44.9246         0         -  0.335559
   44      16       499       483   43.9036         0         -  0.335559
   45      16       499       483   42.9279         0         -  0.335559
   46      16       499       483   41.9948         0         -  0.335559
   47      16       499       483   41.1013         0         -  0.335559
   48      16       499       483    40.245         0         -  0.335559
   49      16       499       483   39.4237         0         -  0.335559
   50      16       499       483   38.6352         0         -  0.335559
   51      16       499       483   37.8776         0         -  0.335559
   52      16       499       483   37.1492         0         -  0.335559
   53      16       499       483   36.4483         0         -  0.335559
   54      16       499       483   35.7733         0         -  0.335559
   55      16       499       483   35.1229         0         -  0.335559
   56      16       499       483   34.4957         0         -  0.335559
   57      16       499       483   33.8905         0         -  0.335559
   58      16       499       483   33.3062         0         -  0.335559
   59      16       499       483   32.7417         0         -  0.335559
2016-08-01 09:01:55.601246 min lat: 0.00710772 max lat: 1.71721 avg lat: 0.335559
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   60      16       499       483    32.196         0         -  0.335559
   61      16       499       483   31.6682         0         -  0.335559
^C

[root@control01 system_initialize]# ceph -s
    cluster 9501b36d-0ac0-44ec-b740-9906d092bac2
     health HEALTH_OK
     monmap e1: 1 mons at {control01=192.168.2.11:6789/0}
            election epoch 2, quorum 0 control01
     osdmap e48: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v1915: 128 pgs, 1 pools, 48080 MB data, 12021 objects
            48393 MB used, 232 TB / 232 TB avail
                 128 active+clean
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux