Quoting Cody (codeology.lab@xxxxxxxxx):
> The Ceph OSD part of the cluster uses 3 identical servers with the
> following specifications:
>
> CPU: 2 x E5-2603 @1.8GHz
> RAM: 16GB
> Network: 1G port shared for Ceph public and cluster traffic

This will hamper throughput a lot.

> Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)

OK, let's stop here first: consumer grade SSD. Percona did a nice
writeup about "fsync" speed on consumer grade SSDs [1]. As I don't know
what drives you use, this might or might not be the issue (you can test
this yourself; see the fio test further down).

> This is not beefy enough in any way, but I am running for PoC only,
> with minimum utilization.
>
> Ceph-mon and ceph-mgr daemons are hosted on the OpenStack Controller
> nodes. Ceph-ansible version is 3.1 and is using Filestore with
> non-collocated scenario (1 SSD for every 2 OSDs). Connection speed
> among Controllers, Computes, and OSD nodes can reach ~900Mbps tested
> using iperf.

Why filestore, if I may ask? I guess bluestore with the bluestore
journal (DB/WAL) on the SSD and data on SATA should give you better
performance, if the SSDs are suitable for the job at all (see the
ceph-ansible sketch further down).

What version of Ceph are you using?

Metrics can give you a lot of insight. Did you take a look at those?
For example the Ceph mgr dashboard?

> I followed the Red Hat Ceph 3 benchmarking procedure [1] and received
> the following results:
>
> Write Test:
>
> Total time run:         80.313004
> Total writes made:      17
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     0.846687
> Stddev Bandwidth:       0.320051
> Max bandwidth (MB/sec): 2
> Min bandwidth (MB/sec): 0
> Average IOPS:           0
> Stddev IOPS:            0
> Max IOPS:               0
> Min IOPS:               0
> Average Latency(s):     66.6582
> Stddev Latency(s):      15.5529
> Max latency(s):         80.3122
> Min latency(s):         29.7059
>
> Sequential Read Test:
>
> Total time run:       25.951049
> Total reads made:     17
> Read size:            4194304
> Object size:          4194304
> Bandwidth (MB/sec):   2.62032
> Average IOPS:         0
> Stddev IOPS:          0
> Max IOPS:             1
> Min IOPS:             0
> Average Latency(s):   24.4129
> Max latency(s):       25.9492
> Min latency(s):       0.117732
>
> Random Read Test:
>
> Total time run:       66.355433
> Total reads made:     46
> Read size:            4194304
> Object size:          4194304
> Bandwidth (MB/sec):   2.77295
> Average IOPS:         0
> Stddev IOPS:          3
> Max IOPS:             27
> Min IOPS:             0
> Average Latency(s):   21.4531
> Max latency(s):       66.1885
> Min latency(s):       0.0395266
>
> Apparently, the results are pathetic...
>
> As I moved on to test block devices, I got the following error message:
>
> # rbd map image01 --pool testbench --name client.admin
> rbd: failed to add secret 'client.admin' to kernel

What replication factor are you using?

Make sure you have the client.admin keyring on the node you are issuing
this command from. If the keyring is present where Ceph expects it to
be, you can omit the --name client.admin. On a monitor node you can
extract the admin keyring with "ceph auth export client.admin". Put the
output of that in /etc/ceph/ceph.client.admin.keyring and this should
work (exact commands further down).

> Any suggestions on the above error and/or debugging would be greatly
> appreciated!
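To see whether the SSD can cope with journal-style writes at all, a
common approach is a single-threaded 4k sync write test with fio
straight against the device. A rough sketch, untested on your setup
(/dev/sdX is a placeholder for the journal SSD; this writes to the raw
device and destroys data on it, so only run it before the disk is in
use):

# QD=1 sync 4k write test against the journal SSD (placeholder /dev/sdX)
fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=journal-test

If the drive only manages a few hundred IOPS here, it is not really
suitable as a journal/WAL device, and that alone would explain rados
bench numbers like yours.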
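For the bluestore suggestion: with ceph-ansible 3.x something along
these lines in group_vars should deploy bluestore with DB/WAL on the
SSD. Sketch only; device names are placeholders, so double-check
against the ceph-ansible docs for your exact version:

# group_vars/osds.yml (sketch; adjust device names to your hardware)
osd_objectstore: bluestore
osd_scenario: non-collocated
devices:                # the 2TB spinners, one OSD each
  - /dev/sdb
  - /dev/sdc
dedicated_devices:      # SSD holding block.db/WAL, one entry per OSD device
  - /dev/sda
  - /dev/sda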
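To check the version and have a look at the dashboard (assuming
Luminous or newer, where the mgr dashboard module exists):

ceph versions                      # which release all daemons are running
ceph mgr module ls                 # list enabled/available mgr modules
ceph mgr module enable dashboard   # served by the active mgr; port depends on release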
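And for the rbd map issue, concretely (pool name taken from your test;
first two commands on a monitor node, the last one on the client):

# on a monitor node
ceph osd pool get testbench size                    # replication factor of the pool
ceph auth export client.admin > ceph.client.admin.keyring

# copy the keyring to /etc/ceph/ceph.client.admin.keyring on the client,
# then mapping without --name should work:
rbd map image01 --pool testbench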
Gr. Stefan

[1]: https://www.percona.com/blog/2018/07/18/why-consumer-ssd-reviews-are-useless-for-database-performance-use-case/

> [1] https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#benchmarking_performance

--
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com