Re: rbd performance issue - can't find bottleneck

On Wed, 17 Jun 2015 16:03:17 +0200 Jacek Jarosiewicz wrote:

> On 06/17/2015 03:34 PM, Mark Nelson wrote:
> > On 06/17/2015 04:10 AM, Jacek Jarosiewicz wrote:
> >> Hi,
> >>
> 
> [ cut ]
> 
> >>
> >> ~60MB/s seq writes
> >> ~100MB/s seq reads
> >> ~2-3k iops random reads
> >
> > Is this per SSD or aggregate?
> 
> aggregate (if I understand you correctly). This is what I see when I run 
> tests on the client - a mapped and mounted rbd.
> 
> >
> >>
> >> The client is an rbd mounted on an Ubuntu Linux box. All the servers
> >> (osd nodes and the client) are running Ubuntu Server 14.04. We tried
> >> to switch to CentOS 7 - but the results are the same.
> >
> > Is this kernel RBD or a VM using QEMU/KVM?  You might want to try fio
> > with the librbd engine and see if you get the same results.  Also,
> > radosbench isn't exactly analogous, but you might try some large
> > sequential write / sequential read tests just as a sanity check.
> >
> 
> This is kernel rbd - testing performance on VMs will be the next step.
> I've tried fio with librbd, but the results were similar.
> I'll run the radosbench tests and post my results.
> 
The kernel RBD client tends to be less than stellar, but that's probably not
your main problem.
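
For the radosbench/librbd round, something along these lines is what I'd run
(a sketch only - pool and image names are placeholders, and the fio image
needs to exist already, e.g. created with "rbd create"):

  # large sequential writes/reads straight to RADOS, as a sanity check
  rados bench -p rbd 60 write --no-cleanup
  rados bench -p rbd 60 seq
  # rados -p rbd cleanup   when done

  # fio through librbd, bypassing the kernel client entirely
  fio --name=seqwrite --ioengine=rbd --clientname=admin --pool=rbd \
      --rbdname=fio-test --rw=write --bs=4M --iodepth=32 \
      --runtime=60 --time_based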

> >>
> >> Here are some technical details about our setup:
> >>
> >> Four exact same osd nodes:
> >> E5-1630 CPU
> >> 32 GB RAM
> >> Mellanox MT27520 56Gbps network cards
> >> SATA controller LSI Logic SAS3008
> >
> > Specs look fine.
> >
> >>
> >> Storage nodes are connected to SuperMicro chassis: 847E1C-R1K28JBOD
> >
> > Is that where the SSDs live?  I'm not a fan of such heavy expander
> > over-subscription, but if you are getting good results outside of Ceph
> > I'm guessing it's something else.
> >
> 
> No, the SSDs are connected to the integrated Intel SATA controller 
> (C610/X99)
> 
> The only disks that reside in the SuperMicro chassis are the SATA drives. 
> And in the latest tests I don't use them - the results I gave are on SSDs 
> only (one SSD serves as an OSD and the journal is on another SSD).
> 
> >>
> >> Four monitors (one on each node). We do not use CephFS so we do not
> >> run ceph-mds.
> >
> > You'll want to go down to 3 or up to 5.  Even numbers of monitors don't
> > really help you in any way (and can actually hurt).  I'd suggest 3.
> >
> 
> OK, will do that, thanks!
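
(For the record, going from 4 to 3 mons is quick: stop the extra mon daemon
and drop it from the monmap. "d" below is just a placeholder mon ID, usually
the hostname:)

  # on the node running the 4th monitor (Ubuntu 14.04 upstart)
  stop ceph-mon id=d
  ceph mon remove d
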
> 
> >
> > You didn't mention the brand/model of SSDs.  Especially for writes this
> > is important as ceph journal writes are O_DSYNC.  Drives that have
> > proper power loss protection can often ignore ATA_CMD_FLUSH and do
> > these very quickly while other drives may need to flush to the flash
> > cells. Also, keep in mind for writes that if you have journals on the
> > SSDs and 3X replication, you'll be doing 6 writes for every client
> > write.
> >
> 
> SSDs are INTEL SSDSC2BW240A4

Intel makes great SSDs, and horrid product numbers in SMART to go with their
differently marketed/named models. That SSDSC2BW240A4 is a 530 series
consumer drive.

Anyway, those are likely your problem, see:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-November/035695.html

Or pretty much any Google result for "Ceph Intel 530".
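
The quick way to confirm it is a small O_DSYNC write test against the journal
SSD, something like this (the path is just an example file on the journal
SSD's filesystem; a drive without power loss protection will typically be
very slow here):

  dd if=/dev/zero of=/path/on/journal-ssd/testfile bs=4k count=100000 oflag=direct,dsync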

When you ran those tests, did you use atop or iostat to watch the SSD
utilization?
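
Something as simple as this on the OSD nodes while the benchmark runs will
show whether the journal SSDs are the ones pegged (watch %util, await and
w/s per device):

  iostat -x 2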

Christian

> The rbd pool is set to have min_size 1 and size 2.
> 
> > For reads and read IOPs on SSDs, you might try disabling in-memory
> > logging and ceph authentication.  You might be interested in some
> > testing we did on a variety of SSDs here:
> >
> > http://www.spinics.net/lists/ceph-users/msg15733.html
> >
> 
> Will read up on that too, thanks!
> 
> J
> 
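
Regarding the in-memory logging / cephx suggestion above: if you want to try
it, something along these lines in ceph.conf is the usual starting point
(test clusters only - this switches cephx off cluster-wide, and the debug
list below is not exhaustive):

  [global]
  auth cluster required = none
  auth service required = none
  auth client required = none
  debug lockdep = 0/0
  debug ms = 0/0
  debug osd = 0/0
  debug filestore = 0/0
  debug journal = 0/0
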


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


