Re: Ceph read / write : Terrible performance


 



Am I the only one who finds it funny that the "ceph problem" was fixed by an update to the disk controller firmware? :-)

Ian

On Thu, Sep 3, 2015 at 11:13 AM, Vickey Singh <vickey.singh22693@xxxxxxxxx> wrote:
Hey Mark / Community

This is the sequence of changes that seems to have fixed the Ceph problem:

1#  Upgraded the disk controller firmware from 6.34 to 6.64 (latest)
2#  Rebooted all nodes so that the new firmware takes effect

Read and write performance is now back to normal, as are system load and CPU utilization.
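
(For anyone else chasing the same symptom: the running controller firmware version can usually be checked from the OS with HP's Smart Array CLI -- hpssacli on newer releases, hpacucli on older ones; which of the two applies to a given SL4540 generation is an assumption on my part -- for example:

    hpssacli ctrl all show detail | grep -i firmware

The same information should also be visible through iLO.)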

- Vickey -


On Wed, Sep 2, 2015 at 11:28 PM, Vickey Singh <vickey.singh22693@xxxxxxxxx> wrote:
Thank you Mark, please see my responses below.

On Wed, Sep 2, 2015 at 5:23 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
On 09/02/2015 08:51 AM, Vickey Singh wrote:
Hello Ceph Experts

I have a strange problem: when I read from or write to a Ceph pool, it does not perform consistently. Please notice the cur MB/s column, which keeps swinging up and down.

-- Ceph Hammer 0.94.2
-- CentOS 6, kernel 2.6
-- Ceph cluster is healthy

You might find that CentOS 7 gives you better performance.  In some cases we were seeing nearly 2X.

Wow, 2X! I would definitely plan for an upgrade. Thanks.
 




One interesting thing: whenever I start a rados bench command for read or write, the CPU idle % drops to around 10 and the system load climbs dramatically.

Hardware

HP SL4540

Please make sure the controller is on the newest firmware.  There used to be a bug that would cause sequential write performance to bottleneck when writeback cache was enabled on the RAID controller.

Last month I upgraded the firmware for this hardware, so I hope it is up to date.
 


32-core CPU
196 GB memory
10G network

Be sure to check the network too.  We've seen a lot of cases where folks have been burned by one of the NICs acting funky.

At first glance the interfaces look good and they are pushing data nicely (whatever they are given).
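
(A quick way to sanity-check the interfaces is to confirm the negotiated link speed, watch the error/drop counters, and run a raw point-to-point throughput test between two OSD nodes with iperf or iperf3 -- the interface name eth0 below is just a placeholder:

    ethtool eth0 | grep -i speed     # should report 10000Mb/s on a 10G link
    ip -s link show eth0             # look for growing RX/TX errors or drops
    iperf3 -s                        # on one node
    iperf3 -c <other-node-ip>        # on the other node

A single NIC that negotiated at 1G or is accumulating errors can produce exactly this kind of sawtooth throughput.)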
 


I don't think the hardware is the problem.

Please give me clues / pointers on how I should troubleshoot this problem.



# rados bench -p glance-test 60 write
  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
  Object prefix: benchmark_data_pouta-s01.pouta.csc.fi_2173350
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
      0       0         0         0         0         0         -         0
      1      16        20         4     15.99        16   0.12308   0.10001
      2      16        37        21   41.9841        68   1.79104  0.827021
      3      16        68        52   69.3122       124  0.084304  0.854829
      4      16       114        98   97.9746       184   0.12285  0.614507
      5      16       188       172   137.568       296  0.210669  0.449784
      6      16       248       232   154.634       240  0.090418  0.390647
      7      16       305       289    165.11       228  0.069769  0.347957
      8      16       331       315   157.471       104  0.026247    0.3345
      9      16       361       345   153.306       120  0.082861  0.320711
     10      16       380       364   145.575        76  0.027964  0.310004
     11      16       393       377   137.067        52   3.73332  0.393318
     12      16       448       432   143.971       220  0.334664  0.415606
     13      16       476       460   141.508       112  0.271096  0.406574
     14      16       497       481   137.399        84  0.257794  0.412006
     15      16       507       491   130.906        40   1.49351  0.428057
     16      16       529       513   115.042        88  0.399384   0.48009
     17      16       533       517   94.6286        16   5.50641  0.507804
     18      16       537       521    83.405        16   4.42682  0.549951
     19      16       538       522    80.349         4   11.2052  0.570363
2015-09-02 09:26:18.398641 min lat: 0.023851 max lat: 11.2052 avg lat: 0.570363
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     20      16       538       522   77.3611         0         -  0.570363
     21      16       540       524   74.8825         4   8.88847  0.591767
     22      16       542       526   72.5748         8   1.41627  0.593555
     23      16       543       527   70.2873         4    8.0856  0.607771
     24      16       555       539   69.5674        48  0.145199  0.781685
     25      16       560       544   68.0177        20    1.4342  0.787017
     26      16       564       548   66.4241        16  0.451905   0.78765
     27      16       566       550   64.7055         8  0.611129  0.787898
     28      16       570       554   63.3138        16   2.51086  0.797067
     29      16       570       554   61.5549         0         -  0.797067
     30      16       572       556   60.1071         4   7.71382  0.830697
     31      16       577       561   59.0515        20   23.3501  0.916368
     32      16       590       574   58.8705        52  0.336684  0.956958
     33      16       591       575   57.4986         4   1.92811  0.958647
     34      16       591       575   56.0961         0         -  0.958647
     35      16       591       575   54.7603         0         -  0.958647
     36      16       597       581   54.0447         8  0.187351   1.00313
     37      16       625       609   52.8394       112   2.12256   1.09256
     38      16       631       615    52.227        24   1.57413   1.10206
     39      16       638       622   51.7232        28   4.41663   1.15086
2015-09-02 09:26:40.510623 min lat: 0.023851 max lat: 27.6704 avg lat: 1.15657
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     40      16       652       636   51.8102        56  0.113345   1.15657
     41      16       682       666   53.1443       120  0.041251   1.17813
     42      16       685       669   52.3395        12  0.501285   1.17421
     43      15       690       675   51.7955        24   2.26605   1.18357
     44      16       728       712   53.6062       148  0.589826   1.17478
     45      16       728       712   52.6158         0         -   1.17478
     46      16       728       712   51.6613         0         -   1.17478
     47      16       728       712   50.7407         0         -   1.17478
     48      16       772       756   52.9332        44  0.234811    1.1946
     49      16       835       819   56.3577       252   5.67087   1.12063
     50      16       890       874   59.1252       220  0.230806   1.06778
     51      16       896       880   58.5409        24  0.382471   1.06121
     52      16       896       880   57.5832         0         -   1.06121
     53      16       896       880   56.6562         0         -   1.06121
     54      16       896       880   55.7587         0         -   1.06121
     55      16       897       881   54.9515         1   4.88333   1.06554
     56      16       897       881   54.1077         0         -   1.06554
     57      16       897       881   53.2894         0         -   1.06554
     58      16       897       881   51.9335         0         -   1.06554
     59      16       897       881   51.1792         0         -   1.06554
2015-09-02 09:27:01.267301min lat: 0.01405 max lat: 27.6704 avg lat: 1.06554
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     60      16       897       881   50.4445         0         -   1.06554
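
(Side note for reproducing the read numbers: rados bench seq/rand needs objects to read back, so the write phase has to be run with --no-cleanup first so the benchmark objects are left in the pool, for example:

    rados bench -p glance-test 60 write --no-cleanup
    rados bench -p glance-test 60 seq

The benchmark objects can be removed afterwards with the rados cleanup subcommand.)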





     cluster 98d89661-f616-49eb-9ccf-84d720e179c0
      health HEALTH_OK
      monmap e3: 3 mons at {s01=10.100.50.1:6789/0,s02=10.100.50.2:6789/0,s03=10.100.50.3:6789/0}, election epoch 666, quorum 0,1,2 s01,s02,s03
      osdmap e121039: 240 osds: 240 up, 240 in
       pgmap v850698: 7232 pgs, 31 pools, 439 GB data, 43090 kobjects
             2635 GB used, 867 TB / 870 TB avail
                 7226 active+clean
                    6 active+clean+scrubbing+deep

Note the last line there.  You'll likely want to try your test again when scrubbing is complete.  Also, you may want to try this script:

Yeah, I have tried a few times when the cluster is perfectly healthy (not doing scrubbing / repairs).
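
(If scrubbing does turn out to coincide with the slow periods, it can be paused while benchmarking:

    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ... run rados bench ...
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub

Just remember to unset both flags afterwards, as above.)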
 

https://github.com/ceph/cbt/blob/master/tools/readpgdump.py

You can invoke it like:

ceph pg dump | ./readpgdump.py

That will give you a bunch of information about the pools on your system.  I'm a little concerned about how many PGs your glance-test pool may have given your totals above.

Thanks for the link, I will do that and also run rados bench against other pools (where the PG count is higher).
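
(The per-pool PG counts can also be pulled straight from the cluster, e.g.

    ceph osd pool get glance-test pg_num
    ceph osd dump | grep '^pool'
    ceph df

With 31 pools and 7232 PGs total across 240 OSDs, a few pools created with an oversized pg_num are a common reason for uneven PG distribution across OSDs, which readpgdump.py should make visible.)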


Now here are some of my observations:

1#  When the cluster is not doing anything (HEALTH_OK, no background scrubbing / repairing) and all system resources (CPU / MEM / NET) are mostly idle, I start rados bench (write / rand / seq). Then, suddenly, after a few seconds:

       --- rados bench output drops from ~500 MB/s to a few tens of MB/s
       --- At the same time the CPUs are ~90% busy and the system load spikes

Once rados bench completes

     --- After a few minutes the system resources become idle again
 
2#  Sometimes some PGs become unclean for a few minutes while rados bench runs, and then they quickly return to active+clean.
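
One thing I plan to try next is watching the OSD disks and CPUs on a single storage node while rados bench runs, roughly like this (sysstat / iostat assumed to be installed):

    iostat -xm 2     # per-disk utilisation, await and queue sizes
    sar -q 2         # run-queue length / load average over time
    top              # which processes are burning the CPU

If the drives behind the controller show very high await exactly when cur MB/s collapses, that would point at the controller / writeback cache rather than at Ceph itself.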


Beyond that I am out of clues, so any help from the community that points me in the right direction would be appreciated.


- Vickey -
 












--
Ian R. Colle
Global Director of Software Engineering
Red Hat, Inc.
icolle@xxxxxxxxxx
+1-303-601-7713
http://www.linkedin.com/in/ircolle
http://www.twitter.com/ircolle
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
