Dear Ceph community,

I have a very small Ceph cluster for testing, with this configuration:

2x compute nodes, each with:
- dual-port 25 GbE NIC
- 2x CPU sockets (56 cores with hyperthreading)
- 10x Intel NVMe DC P3500 drives
- 512 GB RAM

One of the nodes also runs as a monitor. Installation was done using ceph-ansible.

Ceph version: jewel
Storage engine: filestore

Performance tests below:
[root@zeus-59 ceph-block-device]# ceph osd pool ls detail
pool 0 'rbd' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 115 flags hashpspool stripe_width 0
pool 1 'images' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 118 flags hashpspool stripe_width 0
        removed_snaps [1~3,7~4]
pool 3 'backups' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 120 flags hashpspool stripe_width 0
pool 4 'vms' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 122 flags hashpspool stripe_width 0
        removed_snaps [1~7]
pool 5 'volumes' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 124 flags hashpspool stripe_width 0
        removed_snaps [1~3]
pool 6 'scbench' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 100 pgp_num 100 last_change 126 flags hashpspool stripe_width 0
pool 7 'rbdbench' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 100 pgp_num 100 last_change 128 flags hashpspool stripe_width 0
        removed_snaps [1~3]

[root@zeus-59 ceph-block-device]# ceph osd tree
ID WEIGHT   TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 36.17371 root default
-2 18.08685     host zeus-58
 0  1.80869         osd.0         up  1.00000          1.00000
 2  1.80869         osd.2         up  1.00000          1.00000
 4  1.80869         osd.4         up  1.00000          1.00000
 6  1.80869         osd.6         up  1.00000          1.00000
 8  1.80869         osd.8         up  1.00000          1.00000
10  1.80869         osd.10        up  1.00000          1.00000
12  1.80869         osd.12        up  1.00000          1.00000
14  1.80869         osd.14        up  1.00000          1.00000
16  1.80869         osd.16        up  1.00000          1.00000
18  1.80869         osd.18        up  1.00000          1.00000
-3 18.08685     host zeus-59
 1  1.80869         osd.1         up  1.00000          1.00000
 3  1.80869         osd.3         up  1.00000          1.00000
 5  1.80869         osd.5         up  1.00000          1.00000
 7  1.80869         osd.7         up  1.00000          1.00000
 9  1.80869         osd.9         up  1.00000          1.00000
11  1.80869         osd.11        up  1.00000          1.00000
13  1.80869         osd.13        up  1.00000          1.00000
15  1.80869         osd.15        up  1.00000          1.00000
17  1.80869         osd.17        up  1.00000          1.00000
19  1.80869         osd.19        up  1.00000          1.00000

[root@zeus-59 ceph-block-device]# ceph status
    cluster 8e930b6c-455e-4328-872d-cb9f5c0359ae
     health HEALTH_OK
     monmap e1: 1 mons at {zeus-59=10.0.32.59:6789/0}
            election epoch 3, quorum 0 zeus-59
     osdmap e129: 20 osds: 20 up, 20 in
            flags sortbitwise,require_jewel_osds
      pgmap v1166945: 776 pgs, 7 pools, 1183 GB data, 296 kobjects
            2363 GB used, 34678 GB / 37042 GB avail
                 775 active+clean
                   1 active+clean+scrubbing+deep

[root@zeus-59 ceph-block-device]# rados bench -p scbench 10 write --no-cleanup
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_zeus-59.localdomain_2844050
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16       644       628    2511.4      2512    0.0210273    0.025206
    2      16      1319      1303   2605.49      2700    0.0238678   0.0243974
    3      16      2003      1987   2648.89      2736    0.0201334   0.0240726
    4      16      2669      2653   2652.59      2664    0.0258618   0.0240468
    5      16      3349      3333   2666.01      2720    0.0189464   0.0239484
    6      16      4026      4010   2672.96      2708      0.02215   0.0238954
    7      16      4697      4681   2674.49      2684    0.0217258   0.0238887
    8      16      5358      5342   2670.64      2644    0.0265384   0.0239066
    9      16      6043      6027    2678.3      2740    0.0260798   0.0238637
   10      16      6731      6715   2685.64      2752    0.0174624   0.0237982
Total time run:         10.026091
Total writes made:      6731
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     2685.39
Stddev Bandwidth:       70.0286
Max bandwidth (MB/sec): 2752
Min bandwidth (MB/sec): 2512
Average IOPS:           671
Stddev IOPS:            17
Max IOPS:               688
Min IOPS:               628
Average Latency(s):     0.023819
Stddev Latency(s):      0.00463709
Max latency(s):         0.0594516
Min latency(s):         0.0138556

[root@zeus-59 ceph-block-device]# rados bench -p scbench 10 seq
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      15      1150      1135   4498.75      4540    0.0146433   0.0131456
    2      15      2313      2298   4571.38      4652    0.0144489   0.0131564
    3      15      3468      3453   4585.68      4620   0.00975626   0.0131211
    4      15      4663      4648   4633.41      4780    0.0163181   0.0130076
    5      15      5949      5934   4734.49      5144   0.00944718   0.0127327
Total time run:       5.643929
Total reads made:     6731
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   4770.43
Average IOPS:         1192
Stddev IOPS:          59
Max IOPS:             1286
Min IOPS:             1135
Average Latency(s):   0.0126349
Max latency(s):       0.0490061
Min latency(s):       0.00613382

[root@zeus-59 ceph-block-device]# rados bench -p scbench 10 rand
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      15      1197      1182    4726.8      4728    0.0130331    0.012711
    2      15      2364      2349   4697.02      4668    0.0105971   0.0128123
    3      15      3686      3671   4893.78      5288   0.00906867   0.0123103
    4      15      4994      4979   4978.16      5232   0.00946901    0.012104
    5      15      6302      6287   5028.83      5232    0.0115159   0.0119879
    6      15      7620      7605   5069.28      5272   0.00986636   0.0118935
    7      15      8912      8897   5083.31      5168    0.0106201   0.0118648
    8      15     10185     10170   5084.34      5092    0.0116891   0.0118632
    9      15     11484     11469   5096.68      5196   0.00911787   0.0118354
   10      16     12748     12732   5092.16      5052    0.0111988   0.0118476
Total time run:       10.020135
Total reads made:     12748
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   5088.95
Average IOPS:         1272
Stddev IOPS:          55
Max IOPS:             1322
Min IOPS:             1167
Average Latency(s):   0.0118531
Max latency(s):       0.0441046
Min latency(s):       0.00590162
[root@zeus-59 ceph-block-device]# rbd bench-write image01 --pool=rbdbench
bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
  SEC       OPS   OPS/SEC     BYTES/SEC
    1     56159  56180.51  230115361.66
    2    119975  59998.28  245752967.01
    3    182956  60990.78  249818235.33
    4    244195  61054.17  250077889.88
elapsed:     4  ops:   262144  ops/sec:  60006.56  bytes/sec:  245786880.86

I am far from a Ceph/storage expert, but my feeling is that the numbers reported by rbd bench-write are quite poor considering the hardware I am using (please correct me if I am wrong). I would like to ask the community for help digging into this issue and finding out what is throttling performance (CPU? memory? network configuration? not enough data nodes? not enough OSDs per disk? CPU pinning? etc.).
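One thing I notice is that the rbd bench-write run above used its defaults (io_size 4096, sequential pattern, as shown in its own header line), so the ~245 MB/s corresponds to the ~60k ops/sec it reports, i.e. a 4 KiB IOPS test rather than a bandwidth test like the 4 MiB rados bench runs. To compare like-for-like, my next step is something along the lines of the sketch below; the --io-total value is an arbitrary guess on my part, and the fio job is adapted from the examples/rbd.fio file that ships with fio (the pool and image names are the ones from my setup above):

# (sketch) re-run the rbd benchmark with 4 MiB writes to match rados bench
rbd bench-write image01 --pool=rbdbench --io-size 4194304 --io-threads 16 --io-total 10737418240

# (sketch) 4 KiB random writes through librbd via fio's rbd engine
cat > rbd-4k-randwrite.fio <<'EOF'
[global]
ioengine=rbd
clientname=admin
pool=rbdbench
rbdname=image01
invalidate=0
rw=randwrite
bs=4k

[rbd_iodepth32]
iodepth=32
EOF
fio rbd-4k-randwrite.fio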
Apologies beforehand: I know this is quite a broad topic and not an easy one to give an exact answer to, but I would appreciate some guidance, and I hope this can become an interesting performance-troubleshooting thread for other people who are learning distributed storage and Ceph.
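Since network configuration is on my suspect list, one data point I can collect up front is a raw throughput check across the 25 GbE link between the nodes, roughly as sketched below (I am assuming zeus-58 answers on 10.0.32.58, by analogy with zeus-59 being 10.0.32.59; I have not verified that):

# (sketch) raw TCP throughput between the two nodes
# on zeus-58:
#   iperf3 -s
# on zeus-59:
iperf3 -c 10.0.32.58 -P 4 -t 30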
Thank you very much,

Manuel Sopena Ballesteros | Systems engineer