Re: Ceph read / write : Terrible performance

Just about as funny as when "ceph problems" get fixed by changing network
configurations.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Sep 3, 2015 at 11:16 AM, Ian Colle  wrote:
> Am I the only one who finds it funny that the "ceph problem" was fixed by an
> update to the disk controller firmware? :-)
>
> Ian
>
> On Thu, Sep 3, 2015 at 11:13 AM, Vickey Singh wrote:
>>
>> Hey Mark / Community
>>
>> This is the sequence of changes that seems to have fixed the Ceph problem:
>>
>> 1#  Upgrading the disk controller firmware from 6.34 to 6.64 (latest)
>> 2#  Rebooting all nodes so that the new firmware takes effect
>>
>> Read and write operations are now back to normal, as are system load and CPU
>> utilization.
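>>
>> For anyone hitting the same issue, a quick way to confirm the controller
>> firmware level on HP kit (assuming the hpssacli tool is installed) is:
>>
>> # hpssacli ctrl all show config detail | grep -i firmware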
>>
>> - Vickey -
>>
>>
>> On Wed, Sep 2, 2015 at 11:28 PM, Vickey Singh wrote:
>>>
>>> Thank you Mark, please see my responses below.
>>>
>>> On Wed, Sep 2, 2015 at 5:23 PM, Mark Nelson  wrote:
>>>>
>>>> On 09/02/2015 08:51 AM, Vickey Singh wrote:
>>>>>
>>>>> Hello Ceph Experts
>>>>>
>>>>> I have a strange problem: when I am reading from or writing to a Ceph
>>>>> pool, it does not perform properly. Please notice the cur MB/s column,
>>>>> which keeps going up and down.
>>>>>
>>>>> --- Ceph Hammer 0.94.2
>>>>> --- CentOS 6 (kernel 2.6)
>>>>> --- Ceph cluster is healthy
>>>>
>>>>
>>>> You might find that CentOS7 gives you better performance.  In some cases
>>>> we were seeing nearly 2X.
>>>
>>>
>>> Wooo, 2X! I would definitely plan for an upgrade. Thanks.
>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>> One interesting thing: whenever I start a rados bench command for read
>>>>> or write, the CPU idle % drops to ~10 and the system load climbs like
>>>>> anything.
>>>>>
>>>>> Hardware
>>>>>
>>>>> HP SL4540
>>>>
>>>>
>>>> Please make sure the controller is on the newest firmware.  There used
>>>> to be a bug that would cause sequential write performance to bottleneck when
>>>> writeback cache was enabled on the RAID controller.
>>>
>>>
>>> Last month I upgraded the firmware for this hardware, so I hope it is
>>> up to date.
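>>>
>>> To be safe I will also double-check the controller cache settings; a rough
>>> example with hpssacli (the slot number is just a guess for my setup):
>>>
>>> # hpssacli ctrl slot=0 show detail | grep -i cache
>>> # hpssacli ctrl slot=0 modify dwc=disable   ( disable drive write cache, if needed )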
>>>
>>>>
>>>>
>>>>
>>>>> 32-core CPU
>>>>> 196G Memory
>>>>> 10G Network
>>>>
>>>>
>>>> Be sure to check the network too.  We've seen a lot of cases where folks
>>>> have been burned by one of the NICs acting funky.
>>>
>>>
>>> At first glance, the interfaces look good and they are pushing data nicely
>>> (whatever they are given).
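>>>
>>> To rule the NICs out properly, I can watch the error and drop counters while
>>> the benchmark runs (assuming the 10G interface is eth0 on my nodes):
>>>
>>> # ip -s link show eth0
>>> # ethtool -S eth0 | egrep -i 'err|drop'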
>>>
>>>>
>>>>
>>>>>
>>>>> I don't think the hardware is the problem.
>>>>>
>>>>> Please give me clues / pointers on how I should troubleshoot this
>>>>> problem.
>>>>>
>>>>>
>>>>>
>>>>> # rados bench -p glance-test 60 write
>>>>>   Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
>>>>>   Object prefix: benchmark_data_pouta-s01.pouta.csc.fi_2173350
>>>>>     sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>>>       0       0         0         0         0         0         -         0
>>>>>       1      16        20         4     15.99        16   0.12308   0.10001
>>>>>       2      16        37        21   41.9841        68   1.79104  0.827021
>>>>>       3      16        68        52   69.3122       124  0.084304  0.854829
>>>>>       4      16       114        98   97.9746       184   0.12285  0.614507
>>>>>       5      16       188       172   137.568       296  0.210669  0.449784
>>>>>       6      16       248       232   154.634       240  0.090418  0.390647
>>>>>       7      16       305       289    165.11       228  0.069769  0.347957
>>>>>       8      16       331       315   157.471       104  0.026247    0.3345
>>>>>       9      16       361       345   153.306       120  0.082861  0.320711
>>>>>      10      16       380       364   145.575        76  0.027964  0.310004
>>>>>      11      16       393       377   137.067        52   3.73332  0.393318
>>>>>      12      16       448       432   143.971       220  0.334664  0.415606
>>>>>      13      16       476       460   141.508       112  0.271096  0.406574
>>>>>      14      16       497       481   137.399        84  0.257794  0.412006
>>>>>      15      16       507       491   130.906        40   1.49351  0.428057
>>>>>      16      16       529       513   115.042        88  0.399384   0.48009
>>>>>      17      16       533       517   94.6286        16   5.50641  0.507804
>>>>>      18      16       537       521    83.405        16   4.42682  0.549951
>>>>>      19      16       538       522    80.349         4   11.2052  0.570363
>>>>> 2015-09-02 09:26:18.398641 min lat: 0.023851 max lat: 11.2052 avg lat: 0.570363
>>>>>     sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>>>      20      16       538       522   77.3611         0         -  0.570363
>>>>>      21      16       540       524   74.8825         4   8.88847  0.591767
>>>>>      22      16       542       526   72.5748         8   1.41627  0.593555
>>>>>      23      16       543       527   70.2873         4    8.0856  0.607771
>>>>>      24      16       555       539   69.5674        48  0.145199  0.781685
>>>>>      25      16       560       544   68.0177        20    1.4342  0.787017
>>>>>      26      16       564       548   66.4241        16  0.451905   0.78765
>>>>>      27      16       566       550   64.7055         8  0.611129  0.787898
>>>>>      28      16       570       554   63.3138        16   2.51086  0.797067
>>>>>      29      16       570       554   61.5549         0         -  0.797067
>>>>>      30      16       572       556   60.1071         4   7.71382  0.830697
>>>>>      31      16       577       561   59.0515        20   23.3501  0.916368
>>>>>      32      16       590       574   58.8705        52  0.336684  0.956958
>>>>>      33      16       591       575   57.4986         4   1.92811  0.958647
>>>>>      34      16       591       575   56.0961         0         -  0.958647
>>>>>      35      16       591       575   54.7603         0         -  0.958647
>>>>>      36      16       597       581   54.0447         8  0.187351   1.00313
>>>>>      37      16       625       609   52.8394       112   2.12256   1.09256
>>>>>      38      16       631       615    52.227        24   1.57413   1.10206
>>>>>      39      16       638       622   51.7232        28   4.41663   1.15086
>>>>> 2015-09-02 09:26:40.510623 min lat: 0.023851 max lat: 27.6704 avg lat: 1.15657
>>>>>     sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>>>      40      16       652       636   51.8102        56  0.113345   1.15657
>>>>>      41      16       682       666   53.1443       120  0.041251   1.17813
>>>>>      42      16       685       669   52.3395        12  0.501285   1.17421
>>>>>      43      15       690       675   51.7955        24   2.26605   1.18357
>>>>>      44      16       728       712   53.6062       148  0.589826   1.17478
>>>>>      45      16       728       712   52.6158         0         -   1.17478
>>>>>      46      16       728       712   51.6613         0         -   1.17478
>>>>>      47      16       728       712   50.7407         0         -   1.17478
>>>>>      48      16       772       756   52.9332        44  0.234811    1.1946
>>>>>      49      16       835       819   56.3577       252   5.67087   1.12063
>>>>>      50      16       890       874   59.1252       220  0.230806   1.06778
>>>>>      51      16       896       880   58.5409        24  0.382471   1.06121
>>>>>      52      16       896       880   57.5832         0         -   1.06121
>>>>>      53      16       896       880   56.6562         0         -   1.06121
>>>>>      54      16       896       880   55.7587         0         -   1.06121
>>>>>      55      16       897       881   54.9515         1   4.88333   1.06554
>>>>>      56      16       897       881   54.1077         0         -   1.06554
>>>>>      57      16       897       881   53.2894         0         -   1.06554
>>>>>      58      16       897       881   51.9335         0         -   1.06554
>>>>>      59      16       897       881   51.1792         0         -   1.06554
>>>>> 2015-09-02 09:27:01.267301 min lat: 0.01405 max lat: 27.6704 avg lat: 1.06554
>>>>>     sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>>>      60      16       897       881   50.4445         0         -   1.06554
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>      cluster 98d89661-f616-49eb-9ccf-84d720e179c0
>>>>>       health HEALTH_OK
>>>>>       monmap e3: 3 mons at {s01=10.100.50.1:6789/0,s02=10.100.50.2:6789/0,s03=10.100.50.3:6789/0}, election epoch 666, quorum 0,1,2 s01,s02,s03
>>>>>       osdmap e121039: 240 osds: 240 up, 240 in
>>>>>        pgmap v850698: 7232 pgs, 31 pools, 439 GB data, 43090 kobjects
>>>>>              2635 GB used, 867 TB / 870 TB avail
>>>>>                  7226 active+clean
>>>>>                     6 active+clean+scrubbing+deep
>>>>
>>>>
>>>> Note the last line there.  You'll likely want to try your test again
>>>> when scrubbing is complete.  Also, you may want to try this script:
>>>
>>>
>>> Yeah, I have tried it a few times when the cluster was perfectly healthy
>>> (not doing scrubbing / repairs).
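>>>
>>> Next time, to be certain scrubbing cannot interfere with the benchmark, I
>>> will temporarily disable it and re-enable it afterwards:
>>>
>>> # ceph osd set noscrub
>>> # ceph osd set nodeep-scrub
>>> ( run rados bench )
>>> # ceph osd unset noscrub
>>> # ceph osd unset nodeep-scrub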
>>>
>>>>
>>>>
>>>> https://github.com/ceph/cbt/blob/master/tools/readpgdump.py
>>>>
>>>> You can invoke it like:
>>>>
>>>> ceph pg dump | ./readpgdump.py
>>>>
>>>> That will give you a bunch of information about the pools on your
>>>> system.  I'm a little concerned about how many PGs your glance-test pool may
>>>> have given your totals above.
>>>
>>>
>>> Thanks for the link, I will do that and also run rados bench against other
>>> pools (where the PG count is higher).
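>>>
>>> For a quick look at the PG count of a single pool in the meantime:
>>>
>>> # ceph osd pool get glance-test pg_num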
>>>
>>>
>>> Now here are some of my observations:
>>>
>>> 1#  When the cluster is not doing anything (HEALTH_OK, no background
>>> scrubbing / repairing) and all system resources (CPU / MEM / NET) are mostly
>>> idle, and I then start rados bench (write / rand / seq), then suddenly after
>>> a few seconds:
>>>
>>>       --- rados bench output drops from ~500M to a few 10M
>>>       --- At the same time CPU busy hits 90% and the system load jumps up
>>>
>>> Once rados bench completes:
>>>
>>>      --- After a few minutes the system resources become idle again
>>>
>>> 2#  Sometimes some PGs become unclean for a few minutes while rados bench
>>> runs, and then they quickly become active+clean again.
>>>
>>>
>>> I am out of clues, so any help from the community that leads me to think in
>>> the right direction would be appreciated.
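>>>
>>> During the next degraded run I will try to capture where the time goes, e.g.
>>> per-OSD latencies and disk / CPU utilisation on one node:
>>>
>>> # ceph osd perf      ( commit / apply latency per OSD )
>>> # iostat -xm 5       ( per-disk utilisation and await )
>>> # top                ( which processes are burning CPU )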
>>>
>>>
>>> - Vickey -
>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>>
>>
>
>
>
> --
> Ian R. Colle
> Global Director of Software Engineering
> Red Hat, Inc.
> icolle@xxxxxxxxxx
> +1-303-601-7713
> http://www.linkedin.com/in/ircolle
> http://www.twitter.com/ircolle
>
