It's always possible it was the reboot (seriously!) :)

Mark

On 09/03/2015 12:16 PM, Ian Colle wrote:
Am I the only one who finds it funny that the "ceph problem" was fixed by an update to the disk controller firmware? :-)

Ian

On Thu, Sep 3, 2015 at 11:13 AM, Vickey Singh <vickey.singh22693@xxxxxxxxx> wrote:

Hey Mark / Community,

This is the sequence of changes that seems to have fixed the Ceph problem:

1# Upgrading the disk controller firmware from 6.34 to 6.64 (the latest)
2# Rebooting all nodes so the new firmware takes effect

Read and write operations are now normal, and so are system load and CPU utilization.

- Vickey -

On Wed, Sep 2, 2015 at 11:28 PM, Vickey Singh <vickey.singh22693@xxxxxxxxx> wrote:

Thank you Mark, please see my responses below.

On Wed, Sep 2, 2015 at 5:23 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:

On 09/02/2015 08:51 AM, Vickey Singh wrote:

Hello Ceph Experts,

I have a strange problem: when I read from or write to a Ceph pool, throughput is not steady. Please notice the "cur MB/s" column below, which keeps going up and down.

-- Ceph Hammer 0.94.2
-- CentOS 6 (2.6 kernel)
-- The Ceph cluster is healthy

Mark Nelson wrote:
You might find that CentOS 7 gives you better performance. In some cases we were seeing nearly 2X.

Vickey Singh wrote:
Wooo, 2X! I would definitely plan an upgrade. Thanks.

One interesting thing is that whenever I start a rados bench command for read or write, CPU idle % drops to about 10 and system load climbs sharply.

Hardware: HP SL4540

Mark Nelson wrote:
Please make sure the controller is on the newest firmware. There used to be a bug that would cause sequential write performance to bottleneck when writeback cache was enabled on the RAID controller.

Vickey Singh wrote:
Last month I upgraded the firmware on this hardware, so I hope it is up to date.

32-core CPU, 196 GB memory, 10G network

Mark Nelson wrote:
Be sure to check the network too. We've seen a lot of cases where folks have been burned by one of the NICs acting funky.

Vickey Singh wrote:
At first glance the interfaces look good and they are pushing data nicely (whatever they are given). I don't think hardware is the problem.

Please give me clues / pointers on how I should troubleshoot this problem.
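To double-check the two controller-related points above (firmware level and the write-back cache state Mark mentions), something along the following lines can be run on each OSD node. This is only a sketch: it assumes HP's hpssacli utility is installed (older systems may ship hpacucli instead), and the slot number is just an example.

  # Controller details, including firmware version and cache configuration
  # (tool name and slot number are assumptions; adjust for your setup)
  hpssacli ctrl all show detail | grep -iE 'firmware|cache'

  # Per-logical-drive view, including whether write-back caching is enabled
  hpssacli ctrl slot=0 ld all show detail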
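Similarly, for Mark's point about a single NIC acting funky, a minimal per-node sanity check might look like this; eth0 and the iperf3 endpoints are placeholders for the actual 10G interface and hosts.

  # Link speed/duplex and driver-level error counters
  ethtool eth0
  ethtool -S eth0 | grep -iE 'err|drop|fifo'

  # Kernel-level RX/TX error and drop counters
  ip -s link show eth0

  # Raw node-to-node throughput, if iperf3 is installed:
  #   on one node:      iperf3 -s
  #   on another node:  iperf3 -c <server-ip> -P 4 -t 30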
# rados bench -p glance-test 60 write
 Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
 Object prefix: benchmark_data_pouta-s01.pouta.csc.fi_2173350
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1      16        20         4     15.99        16   0.12308   0.10001
    2      16        37        21   41.9841        68   1.79104  0.827021
    3      16        68        52   69.3122       124  0.084304  0.854829
    4      16       114        98   97.9746       184   0.12285  0.614507
    5      16       188       172   137.568       296  0.210669  0.449784
    6      16       248       232   154.634       240  0.090418  0.390647
    7      16       305       289    165.11       228  0.069769  0.347957
    8      16       331       315   157.471       104  0.026247    0.3345
    9      16       361       345   153.306       120  0.082861  0.320711
   10      16       380       364   145.575        76  0.027964  0.310004
   11      16       393       377   137.067        52   3.73332  0.393318
   12      16       448       432   143.971       220  0.334664  0.415606
   13      16       476       460   141.508       112  0.271096  0.406574
   14      16       497       481   137.399        84  0.257794  0.412006
   15      16       507       491   130.906        40   1.49351  0.428057
   16      16       529       513   115.042        88  0.399384   0.48009
   17      16       533       517   94.6286        16   5.50641  0.507804
   18      16       537       521    83.405        16   4.42682  0.549951
   19      16       538       522    80.349         4   11.2052  0.570363
2015-09-02 09:26:18.398641 min lat: 0.023851 max lat: 11.2052 avg lat: 0.570363
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   20      16       538       522   77.3611         0         -  0.570363
   21      16       540       524   74.8825         4   8.88847  0.591767
   22      16       542       526   72.5748         8   1.41627  0.593555
   23      16       543       527   70.2873         4    8.0856  0.607771
   24      16       555       539   69.5674        48  0.145199  0.781685
   25      16       560       544   68.0177        20    1.4342  0.787017
   26      16       564       548   66.4241        16  0.451905   0.78765
   27      16       566       550   64.7055         8  0.611129  0.787898
   28      16       570       554   63.3138        16   2.51086  0.797067
   29      16       570       554   61.5549         0         -  0.797067
   30      16       572       556   60.1071         4   7.71382  0.830697
   31      16       577       561   59.0515        20   23.3501  0.916368
   32      16       590       574   58.8705        52  0.336684  0.956958
   33      16       591       575   57.4986         4   1.92811  0.958647
   34      16       591       575   56.0961         0         -  0.958647
   35      16       591       575   54.7603         0         -  0.958647
   36      16       597       581   54.0447         8  0.187351   1.00313
   37      16       625       609   52.8394       112   2.12256   1.09256
   38      16       631       615    52.227        24   1.57413   1.10206
   39      16       638       622   51.7232        28   4.41663   1.15086
2015-09-02 09:26:40.510623 min lat: 0.023851 max lat: 27.6704 avg lat: 1.15657
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   40      16       652       636   51.8102        56  0.113345   1.15657
   41      16       682       666   53.1443       120  0.041251   1.17813
   42      16       685       669   52.3395        12  0.501285   1.17421
   43      15       690       675   51.7955        24   2.26605   1.18357
   44      16       728       712   53.6062       148  0.589826   1.17478
   45      16       728       712   52.6158         0         -   1.17478
   46      16       728       712   51.6613         0         -   1.17478
   47      16       728       712   50.7407         0         -   1.17478
   48      16       772       756   52.9332        44  0.234811    1.1946
   49      16       835       819   56.3577       252   5.67087   1.12063
   50      16       890       874   59.1252       220  0.230806   1.06778
   51      16       896       880   58.5409        24  0.382471   1.06121
   52      16       896       880   57.5832         0         -   1.06121
   53      16       896       880   56.6562         0         -   1.06121
   54      16       896       880   55.7587         0         -   1.06121
   55      16       897       881   54.9515         1   4.88333   1.06554
   56      16       897       881   54.1077         0         -   1.06554
   57      16       897       881   53.2894         0         -   1.06554
   58      16       897       881   51.9335         0         -   1.06554
   59      16       897       881   51.1792         0         -   1.06554
2015-09-02 09:27:01.267301 min lat: 0.01405 max lat: 27.6704 avg lat: 1.06554
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   60      16       897       881   50.4445         0         -   1.06554

    cluster 98d89661-f616-49eb-9ccf-84d720e179c0
     health HEALTH_OK
     monmap e3: 3 mons at {s01=10.100.50.1:6789/0,s02=10.100.50.2:6789/0,s03=10.100.50.3:6789/0}, election epoch 666, quorum 0,1,2 s01,s02,s03
     osdmap e121039: 240 osds: 240 up, 240 in
      pgmap v850698: 7232 pgs, 31 pools, 439 GB data, 43090 kobjects
            2635 GB used, 867 TB / 870 TB avail
                7226 active+clean
                   6 active+clean+scrubbing+deep

Mark Nelson wrote:
Note the last line there. You'll likely want to try your test again when scrubbing is complete.

Vickey Singh wrote:
Yeah, I have tried it a few times when the cluster is perfectly healthy (no scrubbing / repairs going on).

Mark Nelson wrote:
Also, you may want to try this script:

https://github.com/ceph/cbt/blob/master/tools/readpgdump.py

You can invoke it like:

ceph pg dump | ./readpgdump.py

That will give you a bunch of information about the pools on your system. I'm a little concerned about how many PGs your glance-test pool may have given your totals above.

Vickey Singh wrote:
Thanks for the link, I will do that, and I will also run rados bench against the other pools (where the PG count is higher).

Now here are some of my observations:

1# When the cluster is not doing anything (HEALTH_OK, no background scrubbing / repair) and all system resources (CPU / memory / network) are mostly idle, and I then start rados bench (write / rand / seq), the rados bench output suddenly drops after a few seconds from ~500 MB/s to a few tens of MB/s. At the same time the CPUs are about 90% busy and the system load jumps up. A few minutes after rados bench completes, the system resources become idle again.

2# Sometimes a few PGs become unclean for a few minutes while rados bench is running, and then they quickly go back to active+clean.

I am out of clues, so any help from the community that points me in the right direction would be appreciated.

- Vickey -

--
Ian R. Colle
Global Director of Software Engineering
Red Hat, Inc.
icolle@xxxxxxxxxx
+1-303-601-7713
http://www.linkedin.com/in/ircolle
http://www.twitter.com/ircolle
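On Mark's concern about the PG count of the glance-test pool: besides readpgdump.py, the per-pool pg_num values can be read directly from the cluster. A small sketch, with glance-test used only because it is the pool named in this thread:

  # pg_num / pgp_num for every pool
  ceph osd dump | grep '^pool'

  # A single pool's PG count and its share of the data
  ceph osd pool get glance-test pg_num
  ceph df

  # Mark's script, for a per-pool PG breakdown
  ceph pg dump | ./readpgdump.py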
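For benchmarking the other pools as mentioned above, the objects written by the write phase have to be kept around for the seq/rand read phases. A sketch, with some-pool as a placeholder pool name:

  # Write phase; --no-cleanup keeps the benchmark objects for the read tests
  rados bench -p some-pool 60 write --no-cleanup

  # Sequential and random read phases against the objects written above
  rados bench -p some-pool 60 seq
  rados bench -p some-pool 60 rand

  # Remove the benchmark objects afterwards
  rados -p some-pool cleanup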
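To narrow down the load/CPU spikes from observation 1# and the briefly unclean PGs from observation 2#, it helps to watch the OSD nodes and the cluster while the bench is running. A sketch using commonly available tools (the sysstat package is assumed for iostat/sar):

  # Per-disk utilization and request latency on the OSD nodes, every 5 seconds
  iostat -x 5

  # Run-queue length / load average over time
  sar -q 5

  # Per-OSD commit/apply latency as seen by Ceph; a single slow OSD stands out here
  ceph osd perf

  # PG states while the bench is running
  ceph health detail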
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com