On Wed, 17 Jun 2015 16:03:17 +0200 Jacek Jarosiewicz wrote:

> On 06/17/2015 03:34 PM, Mark Nelson wrote:
> > On 06/17/2015 04:10 AM, Jacek Jarosiewicz wrote:
> >> Hi,
> >>
> [ cut ]
> >>
> >> ~60MB/s seq writes
> >> ~100MB/s seq reads
> >> ~2-3k iops random reads
> >
> > Is this per SSD or aggregate?
>
> Aggregate (if I understand you correctly). This is what I see when I run
> the tests on the client - a mapped and mounted rbd.
>
> >> The client is an rbd mounted on a Linux Ubuntu box. All the servers
> >> (osd nodes and the client) are running Ubuntu Server 14.04. We tried
> >> to switch to CentOS 7 - but the results are the same.
> >
> > Is this kernel RBD or a VM using QEMU/KVM? You might want to try fio
> > with the librbd engine and see if you get the same results. Also,
> > radosbench isn't exactly analogous, but you might try some large
> > sequential write / sequential read tests just as a sanity check.
>
> This is kernel rbd - testing performance on VMs will be the next step.
> I've tried fio with librbd, but the results were similar.
> I'll run the radosbench tests and post my results.
>
Kernel RBD tends to be less than stellar, but it is probably not your main
problem.

> >> Here are some technical details about our setup:
> >>
> >> Four identical osd nodes:
> >> E5-1630 CPU
> >> 32 GB RAM
> >> Mellanox MT27520 56Gbps network cards
> >> SATA controller LSI Logic SAS3008
> >
> > Specs look fine.
> >
> >> Storage nodes are connected to SuperMicro chassis: 847E1C-R1K28JBOD
> >
> > Is that where the SSDs live? I'm not a fan of such heavy expander
> > over-subscription, but if you are getting good results outside of Ceph
> > I'm guessing it's something else.
>
> No, the SSDs are connected to the integrated Intel SATA controller
> (C610/X99).
>
> The only disks that reside in the SuperMicro chassis are the SATA drives,
> and in the last tests I don't use them - the results I gave are on SSDs
> only (one SSD serves as the OSD and the journal is on another SSD).
>
> >> Four monitors (one on each node). We do not use CephFS so we do not
> >> run ceph-mds.
> >
> > You'll want to go down to 3 or up to 5. Even numbers of monitors don't
> > really help you in any way (and can actually hurt). I'd suggest 3.
>
> OK, will do that, thanks!
>
> > You didn't mention the brand/model of SSDs. Especially for writes this
> > is important as ceph journal writes are O_DSYNC. Drives that have
> > proper write loss protection often can ignore ATA_CMD_FLUSH and do
> > these very quickly while other drives may need to flush to the flash
> > cells. Also, keep in mind for writes that if you have journals on the
> > SSDs and 3X replication, you'll be doing 6 writes for every client
> > write.
>
> The SSDs are Intel SSDSC2BW240A4.

Intel make great SSDs - and horrid product numbers in SMART to go with
their differently marketed/named devices.

Anyway, those are likely your problem, see:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-November/035695.html

Or any google result for "Ceph intel 530", probably.

When you ran those tests, did you use atop or iostat to watch the SSD
utilization?

Christian

> The rbd pool is set to have min_size 1 and size 2.
>
> > For reads and read IOPS on SSDs, you might try disabling in-memory
> > logging and ceph authentication. You might be interested in some
> > testing we did on a variety of SSDs here:
> >
> > http://www.spinics.net/lists/ceph-users/msg15733.html
>
> Will read up on that too, thanks!
>
> J
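For the rados bench sanity check mentioned above, something along these
lines should do - a rough sketch only; the pool name "rbd", the 60 second
runtime, the 4M object size and 16 concurrent ops are just placeholders:

  # large sequential writes; --no-cleanup keeps the objects so the read test has data
  rados bench -p rbd 60 write -b 4M -t 16 --no-cleanup
  # sequential reads of the objects written above
  rados bench -p rbd 60 seq -t 16
  # remove the leftover benchmark objects when done
  rados -p rbd cleanup

If rados bench gets well past the ~60/~100 MB/s seen through krbd, the
bottleneck is likely above RADOS rather than in the OSDs themselves.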
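Since fio with librbd already came up, a minimal job file sketch for
reference (the pool "rbd", image "fio-test" and client "admin" are
assumptions, and the image has to exist first, e.g.
"rbd create fio-test --pool rbd --size 10240"):

  [global]
  ioengine=rbd
  clientname=admin
  pool=rbd
  rbdname=fio-test
  # older fio versions require invalidate=0 with the rbd engine
  invalidate=0
  runtime=60
  time_based

  [seq-write]
  rw=write
  bs=4M
  iodepth=32

  # stonewall makes this job wait for the write job above to finish
  [rand-read]
  stonewall
  rw=randread
  bs=4k
  iodepth=32

Save it under any name (e.g. librbd-test.fio) and run "fio librbd-test.fio".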
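Dropping from four monitors to three is roughly this - a sketch assuming
the retired monitor is called "ceph4" and the nodes use Ubuntu 14.04's
upstart init:

  # on the node whose monitor is being retired
  stop ceph-mon id=ceph4
  # remove it from the monitor map
  ceph mon remove ceph4
  # then delete the corresponding [mon.ceph4] section / mon_initial_members
  # entry from ceph.conf on all nodes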
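To see how a single SSD handles the kind of small synchronous writes the
journal issues, the usual check is a direct 4k sync write test against the
drive itself. A sketch only - "/dev/sdX" is a placeholder, and this writes
straight to the device, so use a scratch SSD with nothing on it:

  # DESTRUCTIVE: writes directly to the device, use an empty/scratch SSD only
  fio --name=journal-test --filename=/dev/sdX \
      --direct=1 --sync=1 --rw=write --bs=4k \
      --numjobs=1 --iodepth=1 --runtime=60 --time_based

Drives with proper power loss protection tend to sail through this, while
530-class consumer drives usually collapse to a small fraction of their
rated 4k write IOPS - which would line up with the numbers you are seeing.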
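On watching the SSDs while the benchmarks run, something as simple as this
in another terminal is enough:

  # extended stats, in MB, with timestamps, every 2 seconds
  iostat -xmt 2

Keep an eye on %util, w/s and w_await for the journal SSD; atop gives a
similar per-disk view.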
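For the read/IOPS side, disabling cephx and the in-memory debug logging
looks roughly like the ceph.conf excerpt below. A partial sketch for a
closed test cluster only - the full list of debug subsystems is much
longer, and the daemons (and clients) need restarting with consistent
settings:

  [global]
  # cephx off - only do this on a closed test cluster
  auth_cluster_required = none
  auth_service_required = none
  auth_client_required = none
  # a few of the more expensive debug subsystems; "0/0" disables both the
  # log level and the in-memory gather level
  debug_ms = 0/0
  debug_osd = 0/0
  debug_filestore = 0/0
  debug_journal = 0/0
  debug_auth = 0/0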
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com