Re: Ceph read / write : Terrible performance


 



Am I the only one who finds it funny that the "ceph problem" was fixed by an update to the disk controller firmware? :-)

Ian

On Thu, Sep 3, 2015 at 11:13 AM, Vickey Singh <vickey.singh22693@xxxxxxxxx> wrote:
Hey Mark / Community

This is the sequence of changes that seems to have fixed the Ceph problem:

1#  Upgraded the disk controller firmware from 6.34 to 6.64 (latest)
2#  Rebooted all nodes so that the new firmware takes effect

Read and write performance is now back to normal, as are system load and CPU utilization.
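
(For anyone else chasing the same symptom: the running controller firmware version can usually be checked from the OS with HP's Smart Array CLI -- hpssacli on newer releases, hpacucli on older ones; which of the two applies to a given SL4540 generation is an assumption on my part -- for example:

    hpssacli ctrl all show detail | grep -i firmware

The same information should also be visible through iLO.)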

- Vickey -


On Wed, Sep 2, 2015 at 11:28 PM, Vickey Singh <vickey.singh22693@xxxxxxxxx> wrote:
Thank you Mark, please see my responses below.

On Wed, Sep 2, 2015 at 5:23 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
On 09/02/2015 08:51 AM, Vickey Singh wrote:
Hello Ceph Experts

I have a strange problem: when I read from or write to a Ceph pool, it does not perform consistently. Please notice the cur MB/s column, which keeps swinging up and down.

-- Ceph Hammer 0.94.2
-- CentOS 6, kernel 2.6
-- Ceph cluster is healthy

You might find that CentOS 7 gives you better performance.  In some cases we were seeing nearly 2X.

Wow, 2X! I would definitely plan for an upgrade. Thanks.
 




One interesting thing: whenever I start a rados bench command for read or write, the CPU idle % drops to around 10 and the system load climbs dramatically.

Hardware

HP SL4540

Please make sure the controller is on the newest firmware.  There used to be a bug that would cause sequential write performance to bottleneck when writeback cache was enabled on the RAID controller.

Last month I upgraded the firmware for this hardware, so I hope it is up to date.
 


32-core CPU
196 GB memory
10G network

Be sure to check the network too.  We've seen a lot of cases where folks have been burned by one of the NICs acting funky.

At first glance the interfaces look good and they are pushing data nicely (whatever they are given).
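
(A quick way to sanity-check the interfaces is to confirm the negotiated link speed, watch the error/drop counters, and run a raw point-to-point throughput test between two OSD nodes with iperf or iperf3 -- the interface name eth0 below is just a placeholder:

    ethtool eth0 | grep -i speed     # should report 10000Mb/s on a 10G link
    ip -s link show eth0             # look for growing RX/TX errors or drops
    iperf3 -s                        # on one node
    iperf3 -c <other-node-ip>        # on the other node

A single NIC that negotiated at 1G or is accumulating errors can produce exactly this kind of sawtooth throughput.)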
 


I don't think the hardware is the problem.

Please give me clues / pointers on how I should troubleshoot this problem.



# rados bench -p glance-test 60 write
  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
  Object prefix: benchmark_data_pouta-s01.pouta.csc.fi_2173350
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
      0       0         0         0         0         0         -         0
      1      16        20         4     15.99        16   0.12308   0.10001
      2      16        37        21   41.9841        68   1.79104  0.827021
      3      16        68        52   69.3122       124  0.084304  0.854829
      4      16       114        98   97.9746       184   0.12285  0.614507
      5      16       188       172   137.568       296  0.210669  0.449784
      6      16       248       232   154.634       240  0.090418  0.390647
      7      16       305       289    165.11       228  0.069769  0.347957
      8      16       331       315   157.471       104  0.026247    0.3345
      9      16       361       345   153.306       120  0.082861  0.320711
     10      16       380       364   145.575        76  0.027964  0.310004
     11      16       393       377   137.067        52   3.73332  0.393318
     12      16       448       432   143.971       220  0.334664  0.415606
     13      16       476       460   141.508       112  0.271096  0.406574
     14      16       497       481   137.399        84  0.257794  0.412006
     15      16       507       491   130.906        40   1.49351  0.428057
     16      16       529       513   115.042        88  0.399384   0.48009
     17      16       533       517   94.6286        16   5.50641  0.507804
     18      16       537       521    83.405        16   4.42682  0.549951
     19      16       538       522    80.349         4   11.2052  0.570363
2015-09-02 09:26:18.398641 min lat: 0.023851 max lat: 11.2052 avg lat: 0.570363
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     20      16       538       522   77.3611         0         -  0.570363
     21      16       540       524   74.8825         4   8.88847  0.591767
     22      16       542       526   72.5748         8   1.41627  0.593555
     23      16       543       527   70.2873         4    8.0856  0.607771
     24      16       555       539   69.5674        48  0.145199  0.781685
     25      16       560       544   68.0177        20    1.4342  0.787017
     26      16       564       548   66.4241        16  0.451905   0.78765
     27      16       566       550   64.7055         8  0.611129  0.787898
     28      16       570       554   63.3138        16   2.51086  0.797067
     29      16       570       554   61.5549         0         -  0.797067
     30      16       572       556   60.1071         4   7.71382  0.830697
     31      16       577       561   59.0515        20   23.3501  0.916368
     32      16       590       574   58.8705        52  0.336684  0.956958
     33      16       591       575   57.4986         4   1.92811  0.958647
     34      16       591       575   56.0961         0         -  0.958647
     35      16       591       575   54.7603         0         -  0.958647
     36      16       597       581   54.0447         8  0.187351   1.00313
     37      16       625       609   52.8394       112   2.12256   1.09256
     38      16       631       615    52.227        24   1.57413   1.10206
     39      16       638       622   51.7232        28   4.41663   1.15086
2015-09-02 09:26:40.510623 min lat: 0.023851 max lat: 27.6704 avg lat: 1.15657
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     40      16       652       636   51.8102        56  0.113345   1.15657
     41      16       682       666   53.1443       120  0.041251   1.17813
     42      16       685       669   52.3395        12  0.501285   1.17421
     43      15       690       675   51.7955        24   2.26605   1.18357
     44      16       728       712   53.6062       148  0.589826   1.17478
     45      16       728       712   52.6158         0         -   1.17478
     46      16       728       712   51.6613         0         -   1.17478
     47      16       728       712   50.7407         0         -   1.17478
     48      16       772       756   52.9332        44  0.234811    1.1946
     49      16       835       819   56.3577       252   5.67087   1.12063
     50      16       890       874   59.1252       220  0.230806   1.06778
     51      16       896       880   58.5409        24  0.382471   1.06121
     52      16       896       880   57.5832         0         -   1.06121
     53      16       896       880   56.6562         0         -   1.06121
     54      16       896       880   55.7587         0         -   1.06121
     55      16       897       881   54.9515         1   4.88333   1.06554
     56      16       897       881   54.1077         0         -   1.06554
     57      16       897       881   53.2894         0         -   1.06554
     58      16       897       881   51.9335         0         -   1.06554
     59      16       897       881   51.1792         0         -   1.06554
2015-09-02 09:27:01.267301min lat: 0.01405 max lat: 27.6704 avg lat: 1.06554
    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     60      16       897       881   50.4445         0         -   1.06554
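
(Side note for reproducing the read numbers: rados bench seq/rand needs objects to read back, so the write phase has to be run with --no-cleanup first so the benchmark objects are left in the pool, for example:

    rados bench -p glance-test 60 write --no-cleanup
    rados bench -p glance-test 60 seq

The benchmark objects can be removed afterwards with the rados cleanup subcommand.)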





     cluster 98d89661-f616-49eb-9ccf-84d720e179c0
      health HEALTH_OK
      monmap e3: 3 mons at {s01=10.100.50.1:6789/0,s02=10.100.50.2:6789/0,s03=10.100.50.3:6789/0}, election epoch 666, quorum 0,1,2 s01,s02,s03
      osdmap e121039: 240 osds: 240 up, 240 in
       pgmap v850698: 7232 pgs, 31 pools, 439 GB data, 43090 kobjects
             2635 GB used, 867 TB / 870 TB avail
                 7226 active+clean
                    6 active+clean+scrubbing+deep

Note the last line there.  You'll likely want to try your test again when scrubbing is complete.  Also, you may want to try this script:

Yeah, I have tried a few times when the cluster is perfectly healthy (not doing scrubbing / repairs).
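
(If scrubbing does turn out to coincide with the slow periods, it can be paused while benchmarking:

    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ... run rados bench ...
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub

Just remember to unset both flags afterwards, as above.)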
 

https://github.com/ceph/cbt/blob/master/tools/readpgdump.py

You can invoke it like:

ceph pg dump | ./readpgdump.py

That will give you a bunch of information about the pools on your system.  I'm a little concerned about how many PGs your glance-test pool may have given your totals above.

Thanks for the link, I will do that and also run rados bench against other pools (where the PG count is higher).
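
(The per-pool PG counts can also be pulled straight from the cluster, e.g.

    ceph osd pool get glance-test pg_num
    ceph osd dump | grep '^pool'
    ceph df

With 31 pools and 7232 PGs total across 240 OSDs, a few pools created with an oversized pg_num are a common reason for uneven PG distribution across OSDs, which readpgdump.py should make visible.)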


Now here are some of my observations:

1#  When the cluster is not doing anything (HEALTH_OK, no background scrubbing / repairing) and all system resources (CPU / MEM / NET) are mostly idle, I start rados bench (write / rand / seq). Then, suddenly, after a few seconds:

       --- rados bench output drops from ~500 MB/s to a few tens of MB/s
       --- At the same time the CPUs are ~90% busy and the system load spikes

Once rados bench completes

     --- After a few minutes the system resources become idle again
 
2#  Sometimes some PGs become unclean for a few minutes while rados bench runs, and then they quickly return to active+clean.
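
One thing I plan to try next is watching the OSD disks and CPUs on a single storage node while rados bench runs, roughly like this (sysstat / iostat assumed to be installed):

    iostat -xm 2     # per-disk utilisation, await and queue sizes
    sar -q 2         # run-queue length / load average over time
    top              # which processes are burning the CPU

If the drives behind the controller show very high await exactly when cur MB/s collapses, that would point at the controller / writeback cache rather than at Ceph itself.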


Beyond that I am out of clues, so any help from the community that points me in the right direction would be appreciated.


- Vickey -
 












--
Ian R. Colle
Global Director of Software Engineering
Red Hat, Inc.
icolle@xxxxxxxxxx
+1-303-601-7713
http://www.linkedin.com/in/ircolle
http://www.twitter.com/ircolle
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
