Re: Slow rbd read performance

Hello,

On Thu, 26 Dec 2019 18:11:29 +0100 Ml Ml wrote:

> Hello Christian,
> 
> thanks for your reply. How should I benchmark my OSDs?
>
Benchmarking individual components can be helpful if you suspect something
specific, but first you need to get a grip on what your systems are actually
doing. Re-read my mail and familiarize yourself with atop as well as tools
like Prometheus and Grafana to get that insight.
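For example (purely illustrative, adjust the interval to taste), running

  atop 5

on each Ceph node while a benchmark is running lets you watch the DSK lines
for disks sitting at or near 100% busy, which is usually the first hint of
where the bottleneck is.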

> "dd bs=1M count=2048 if=/dev/sdX of=/dev/null" for each OSD?
> 
You'd be comparing apples with oranges again, as the block size in the
benches below is 4MiB. Also, a "direct" flag would exclude caching effects.
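If you do want a quick per-disk sanity check, something closer to the bench
parameters below would be (illustrative only, a read-only test against the
raw device):

  dd if=/dev/sdX of=/dev/null bs=4M count=256 iflag=direct

i.e. 1GiB in 4MiB blocks with the page cache bypassed.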


> Here are my OSD (write) benchmarks:
>
The variance is significant here, especially if the cluster was quiescent
at the time.
If the low results (<100MB/s) can be reproduced on the same OSDs, you have
located at least one problem spot; see the example below.
The slowest OSD involved in an operation determines the overall performance.
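For example, re-using the command you already ran, just narrowed down to the
slow ones:

  ceph tell osd.5 bench -f plain
  ceph tell osd.10 bench -f plain

Run those a few times each; if they stay well below the rest, start looking
at those disks (SMART data, busy% in atop) first.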

Christian

> root@ceph01:~# ceph tell osd.* bench -f plain
> osd.0: bench: wrote 1GiB in blocks of 4MiB in 7.80794 sec at 131MiB/sec 32 IOPS
> osd.1: bench: wrote 1GiB in blocks of 4MiB in 7.46659 sec at 137MiB/sec 34 IOPS
> osd.2: bench: wrote 1GiB in blocks of 4MiB in 7.59962 sec at 135MiB/sec 33 IOPS
> osd.3: bench: wrote 1GiB in blocks of 4MiB in 4.58729 sec at 223MiB/sec 55 IOPS
> osd.4: bench: wrote 1GiB in blocks of 4MiB in 4.94816 sec at 207MiB/sec 51 IOPS
> osd.5: bench: wrote 1GiB in blocks of 4MiB in 11.7797 sec at 86.9MiB/sec 21 IOPS
> osd.6: bench: wrote 1GiB in blocks of 4MiB in 11.6019 sec at 88.3MiB/sec 22 IOPS
> osd.7: bench: wrote 1GiB in blocks of 4MiB in 8.87174 sec at 115MiB/sec 28 IOPS
> osd.8: bench: wrote 1GiB in blocks of 4MiB in 10.6859 sec at 95.8MiB/sec 23 IOPS
> osd.10: bench: wrote 1GiB in blocks of 4MiB in 12.1083 sec at 84.6MiB/sec 21 IOPS
> osd.11: bench: wrote 1GiB in blocks of 4MiB in 6.26344 sec at 163MiB/sec 40 IOPS
> osd.12: bench: wrote 1GiB in blocks of 4MiB in 8.12922 sec at 126MiB/sec 31 IOPS
> osd.13: bench: wrote 1GiB in blocks of 4MiB in 5.5416 sec at 185MiB/sec 46 IOPS
> osd.14: bench: wrote 1GiB in blocks of 4MiB in 4.99461 sec at 205MiB/sec 51 IOPS
> osd.15: bench: wrote 1GiB in blocks of 4MiB in 5.84936 sec at 175MiB/sec 43 IOPS
> osd.16: bench: wrote 1GiB in blocks of 4MiB in 6.72942 sec at 152MiB/sec 38 IOPS
> osd.17: bench: wrote 1GiB in blocks of 4MiB in 10.3651 sec at 98.8MiB/sec 24 IOPS
> osd.18: bench: wrote 1GiB in blocks of 4MiB in 8.33947 sec at 123MiB/sec 30 IOPS
> osd.19: bench: wrote 1GiB in blocks of 4MiB in 4.79787 sec at 213MiB/sec 53 IOPS
> osd.20: bench: wrote 1GiB in blocks of 4MiB in 8.11134 sec at 126MiB/sec 31 IOPS
> osd.21: bench: wrote 1GiB in blocks of 4MiB in 5.70753 sec at 179MiB/sec 44 IOPS
> osd.22: bench: wrote 1GiB in blocks of 4MiB in 4.82281 sec at 212MiB/sec 53 IOPS
> osd.23: bench: wrote 1GiB in blocks of 4MiB in 8.04044 sec at 127MiB/sec 31 IOPS
> osd.24: bench: wrote 1GiB in blocks of 4MiB in 4.64409 sec at 220MiB/sec 55 IOPS
> osd.25: bench: wrote 1GiB in blocks of 4MiB in 6.23562 sec at 164MiB/sec 41 IOPS
> osd.27: bench: wrote 1GiB in blocks of 4MiB in 7.00978 sec at 146MiB/sec 36 IOPS
> osd.32: bench: wrote 1GiB in blocks of 4MiB in 6.38438 sec at 160MiB/sec 40 IOPS
> 
> Thanks,
> Mario
> 
> 
> 
> On Tue, Dec 24, 2019 at 1:46 AM Christian Balzer <chibi@xxxxxxx> wrote:
> >
> >
> > Hello,
> >
> > On Mon, 23 Dec 2019 22:14:15 +0100 Ml Ml wrote:
> >  
> > > Hohoho Merry Christmas and Hello,
> > >
> > > I set up a "poor man's" Ceph cluster with 3 nodes, one switch and
> > > normal standard HDDs.
> > >
> > > My problem: with the rbd benchmark I get 190MB/sec write, but only
> > > 45MB/sec read speed.
> > >  
> > Something is severely off with your testing or cluster if reads are slower
> > than writes, especially by this margin.
> >  
> > > Here is the Setup: https://i.ibb.co/QdYkBYG/ceph.jpg
> > >
> > > I plan to implement a separate switch to separate the public from the
> > > cluster network, but I think this is not my current problem here.
> > >  
> > You don't mention how many HDDs per server; 10Gb/s is most likely fine, and
> > a separate network (either physical or logical) is usually not needed or
> > beneficial.
> > Your results indicate that the HIGHEST peak used 70% of your bandwidth and
> > that your disks can only sustain 20% of it.
> >
> > Do your tests consistently with the same tool.
> > Neither rados bench nor rbd bench is ideal, but at least they give ballpark
> > figures.
> > fio on the actual mount on your backup server would be best.
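(A sketch of what I mean by that, with a purely illustrative mount point and
job size:

  fio --name=seqread --filename=/mnt/backup/fio.test --rw=read --bs=4M \
      --size=10G --direct=1 --ioengine=libaio --iodepth=16

plus the same run with --rw=write for comparison.)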
> >
> > And testing on a Ceph node is also prone to skewed results; test from the
> > actual client, your backup server.
> >
> > Make sure your network does what you want, and monitor the Ceph nodes with
> > e.g. atop during the test runs to see where the obvious bottlenecks are.
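(For the network part, a simple iperf3 run is usually enough; iperf3 is just
one common option, not something from the earlier mail:

  iperf3 -s                          # on one of the ceph nodes
  iperf3 -c <ceph-node> -P 4 -t 30   # from the backup server

Anything far below ~9Gbit/s on a 10GbE link deserves a closer look.)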
> >
> > Christian
> >  
> > > I mount the image with rbd from the backup server. It seems that I get
> > > good write but slow read speed. More details at the end of the mail.
> > >
> > > rados bench -p scbench 30 write --no-cleanup:
> > > ---------------------------------------------------------------------
> > > Total time run:         34.269336
> > > ...
> > > Bandwidth (MB/sec):     162.945
> > > Stddev Bandwidth:       198.818
> > > Max bandwidth (MB/sec): 764
> > > Min bandwidth (MB/sec): 0
> > > Average IOPS:           40
> > > Stddev IOPS:            49
> > > Max IOPS:               191
> > > Min IOPS:               0
> > > Average Latency(s):     0.387122
> > > Stddev Latency(s):      1.24094
> > > Max latency(s):         11.883
> > > Min latency(s):         0.0161869
> > >
> > >
> > > Here are the rbd benchmarks run on ceph01:
> > > ----------------------------------------------------------------------
> > > rbd -p rbdbench bench $RBD_IMAGE_NAME --io-type write --io-size 8192 --io-threads 256 --io-total 10G --io-pattern seq
> > > ...
> > > elapsed:    56  ops:  1310720  ops/sec: 23295.63  bytes/sec: 190837820.82 (190MB/sec) => OKAY
> > >
> > >
> > > rbd -p rbdbench bench $RBD_IMAGE_NAME --io-type read --io-size 8192 --io-threads 256 --io-total 10G --io-pattern seq
> > > ...
> > > elapsed:   237  ops:  1310720  ops/sec:  5517.19  bytes/sec: 45196784.26 (45MB/sec) => WHY JUST 45MB/sec?
> > >
> > > Since I ran those rbd benchmarks on ceph01, I guess the problem is not
> > > related to my backup rbd mount at all?
> > >
> > > Thanks,
> > > Mario
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com  
> >
> >
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx           Rakuten Mobile Inc.  
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Rakuten Mobile Inc.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



