Thanks, this is really helpful, and it shows me that I am not doing this the right way. You hit the nail on the head by asking about the *replication factor*, because I do not know how to change it. AFAIK it is *3x* by default, but I would like to change it to *2x*, for example. So please excuse two naive questions before my cluster info [1]:

- How can I change my replication factor? I am assuming I can change it through the vstart script (my current guess is sketched below the cluster info).
- How can I change the ethernet speed on a test cluster, for example between 1gbit and 10gbit ethernet, like you have done? Again, I am assuming I can change it through the vstart script.

[1] I am running a minimal cluster of 4 OSDs. I am passing the following shell parameters to vstart.sh:

MDS=1 RGW=1 MON=1 OSD=4 ../src/vstart.sh -d -l -n --bluestore

  cluster:
    id:     fce9b3c6-2814-4df2-a5e5-ee0d001a8f4f
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum a (age 4m)
    mgr: x(active, since 4m)
    osd: 4 osds: 4 up (since 3m), 4 in (since 3m)
    rgw: 1 daemon active (8000)

  data:
    pools:   5 pools, 112 pgs
    objects: 329 objects, 27 KiB
    usage:   4.0 GiB used, 400 GiB / 404 GiB avail
    pgs:     0.893% pgs not active
             111 active+clean
             1   peering
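My current guess for the first question, untested and assuming the replication factor is simply the pool's "size" setting (the pool name rbd here is only a placeholder for whatever pool I benchmark against), is to run something like this from the build directory where vstart.sh wrote its ceph.conf:

  ./bin/ceph osd pool get rbd size        # show the current replication factor (3 by default)
  ./bin/ceph osd pool set rbd size 2      # keep only 2 copies of each object
  ./bin/ceph osd pool set rbd min_size 1  # still allow I/O when only one replica is up
  # and, if I understand it right, pools created afterwards would pick up a
  # different default via:  ./bin/ceph config set global osd_pool_default_size 2

Please correct me if this is not the right way to do it.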
On Wed, Feb 10, 2021 at 10:47 AM Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:

> You have to tell a bit about your cluster setup, like nr of osd's, 3x
> replication on your testing pool?
>
> Eg. this [1] was my test on a cluster with only 1gbit ethernet, 3x repl hdd
> pool. This [2] with 10gbit and more osd's added.
>
> [2]
> [root@c01 ~]# rados bench -p rbd 10 write
> hints = 1
> Maintaining 16 concurrent writes of 4194304 bytes to objects of size
> 4194304 for up to 10 seconds or 0 objects
> Object prefix: benchmark_data_c01_3576497
>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
>     0       0         0         0         0         0            -           0
>     1      16        41        25   99.9948       100     0.198773     0.41148
>     2      16       101        85   169.984       240     0.203578    0.347027
>     3      16       172       156   207.979       284    0.0863202    0.296866
>     4      16       245       229   228.975       292     0.139681    0.268933
>     5      16       322       306   244.772       308     0.107296    0.257353
>     6      16       385       369    245.97       252     0.601879    0.250782
>     7      16       460       444   253.684       300     0.154803    0.247178
>     8      16       541       525   262.467       324     0.274302    0.241951
>     9      16       604       588     261.3       252      0.11929    0.238717
>    10      16       672       656   262.367       272     0.134654    0.241424
> Total time run:         10.1504
> Total writes made:      673
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     265.212
> Stddev Bandwidth:       63.0823
> Max bandwidth (MB/sec): 324
> Min bandwidth (MB/sec): 100
> Average IOPS:           66
> Stddev IOPS:            15.7706
> Max IOPS:               81
> Min IOPS:               25
> Average Latency(s):     0.241012
> Stddev Latency(s):      0.154282
> Max latency(s):         1.05851
> Min latency(s):         0.0702826
> Cleaning up (deleting benchmark objects)
> Removed 673 objects
> Clean up completed and total clean up time: 1.26346
>
> [1]
> [@]# rados bench -p rbd 10 write --no-cleanup
> hints = 1
> Maintaining 16 concurrent writes of 4194304 bytes to objects of size
> 4194304 for up to 10 seconds or 0 objects
> Object prefix: benchmark_data_c01_18283
>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
>     0       0         0         0         0         0            -           0
>     1      16        27        11   43.9884        44     0.554119    0.624979
>     2      16        47        31   61.9841        80      1.04112    0.793553
>     3      16        57        41    54.654        40      1.33104    0.876273
>     4      16        75        59   58.9869        72     0.840098     0.97091
>     5      16        97        81   64.7864        88      1.02915    0.922043
>     6      16       105        89   59.3207        32       1.2471    0.915408
>     7      16       129       113   64.5582        96     0.616579    0.947882
>     8      16       145       129   64.4866        64      1.09397    0.921441
>     9      16       163       147   65.3201        72     0.885566    0.906388
>    10      16       166       150   59.9881        12      1.22834    0.909591
>    11      13       167       154   55.9889        16      2.30029    0.942798
> Total time run:         11.141939
> Total writes made:      167
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     59.9537
> Stddev Bandwidth:       28.7889
> Max bandwidth (MB/sec): 96
> Min bandwidth (MB/sec): 12
> Average IOPS:           14
> Stddev IOPS:            7
> Max IOPS:               24
> Min IOPS:               3
> Average Latency(s):     1.06157
> Stddev Latency(s):      0.615773
> Max latency(s):         3.23088
> Min latency(s):         0.171585
>
> > -----Original Message-----
> > Sent: 10 February 2021 10:14
> > To: Marc <Marc@xxxxxxxxxxxxxxxxx>
> > Cc: ceph-users <ceph-users@xxxxxxx>
> > Subject: Re: struggling to achieve high bandwidth on Ceph dev cluster - HELP
> >
> > Thanks for the reply.
> >
> > Yes, 4MB is the default. I have tried it. For example, below (posted) is a
> > 4MB (default) run for 600 seconds. The seq read and rand read give me good
> > bandwidth (not posted here), but with write it is still very low. And I am
> > particularly interested in block sizes, and the rados bench tool has a
> > block size option which I have been using.
> >
> > Total time run:         601.106
> > Total writes made:      2966
> > Write size:             4194304
> > Object size:            4194304
> > Bandwidth (MB/sec):     19.7369
> > Stddev Bandwidth:       14.8408
> > Max bandwidth (MB/sec): 64
> > Min bandwidth (MB/sec): 0
> > Average IOPS:           4
> > Stddev IOPS:            3.67408
> > Max IOPS:               16
> > Min IOPS:               0
> > Average Latency(s):     3.24064
> > Stddev Latency(s):      2.75111
> > Max latency(s):         42.4551
> > Min latency(s):         0.167701
> >
> > On Wed, Feb 10, 2021 at 9:46 AM Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:
> >
> > > Try 4MB, that is the default, is it not?
> > >
> > > > -----Original Message-----
> > > > Sent: 10 February 2021 09:30
> > > > To: ceph-users <ceph-users@xxxxxxx>; dev <dev@xxxxxxx>; ceph-qa@xxxxxxx
> > > > Subject: struggling to achieve high bandwidth on Ceph dev cluster - HELP
> > > >
> > > > Hi,
> > > >
> > > > I am using the rados bench tool, currently on the development cluster
> > > > after running the vstart.sh script. It is working fine and I am
> > > > interested in benchmarking the cluster. However, I am struggling to
> > > > achieve good bandwidth (MB/sec). My target throughput is at least
> > > > 50 MB/sec or more, but mostly I am achieving around 15-20 MB/sec. So,
> > > > very poor.
> > > >
> > > > I am quite sure I am missing something. Either I have to change my
> > > > cluster through the vstart.sh script, or I am not fully utilizing the
> > > > rados bench tool. Or maybe both, i.e. neither the right cluster nor
> > > > the right use of the rados bench tool.
> > > >
> > > > Some of the shell examples I have been using to build the cluster are
> > > > below:
> > > > MDS=0 RGW=1 ../src/vstart.sh -d -l -n --bluestore
> > > > MDS=0 RGW=1 MON=1 OSD=4 ../src/vstart.sh -d -l -n --bluestore
> > > >
> > > > While using the rados bench tool I have been trying different block
> > > > sizes: 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K. And I have also been
> > > > changing the -t parameter in the shell to increase concurrent IOs.
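P.S. In case I am misusing the tool, this is roughly the sequence I have been running from the build directory (the pool name "testbench" and the 64K block size / -t 32 values are only placeholders for the combinations I cycle through); please tell me if the options look wrong:

  ./bin/ceph osd pool create testbench 32 32                           # throwaway pool for benchmarking
  ./bin/rados bench -p testbench 60 write -b 65536 -t 32 --no-cleanup  # 64K writes, 32 concurrent ops
  ./bin/rados bench -p testbench 60 seq -t 32                          # sequential reads of the objects left behind
  ./bin/rados bench -p testbench 60 rand -t 32                         # random reads
  ./bin/rados -p testbench cleanup                                     # remove the benchmark objects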