I will dig into the network and determine whether we have any issues. One thing to note is that our MTU is 1500 and will not be changed for this test. Simply put, I am not going to be able to get these changes implemented in our current network. I don't expect a huge increase in performance from moving to jumbo frames, and I suspect it is not worth it for a POC and not the reason my cluster performance is sucking so bad at this particular moment.
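For the large-packet ping checks suggested in the quoted message, the payload size that exactly fills a 1500-byte MTU can be worked out from the IPv4 and ICMP header sizes. The following is a minimal sketch, assuming IPv4 without IP options and the Linux iputils `ping` (where `-M do` forbids fragmentation and `-s` sets the payload size). The function names are made up for illustration, and `cephosd02` is a hypothetical second host.

```python
from typing import List

# Fixed header sizes: IPv4 without options, and the ICMP echo header.
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def ping_command(host: str, mtu: int, count: int = 5) -> List[str]:
    """Build a Linux iputils ping command that fails if the path MTU is smaller than mtu."""
    return [
        "ping",
        "-M", "do",                         # set don't-fragment (Linux iputils option)
        "-c", str(count),                   # number of echo requests to send
        "-s", str(max_ping_payload(mtu)),   # ICMP payload size in bytes
        host,
    ]


if __name__ == "__main__":
    # cephosd01 appears in this thread; cephosd02 is a hypothetical peer.
    for host in ["cephosd01", "cephosd02"]:
        print(" ".join(ping_command(host, 1500)))
```

With an MTU of 1500 the payload comes out to 1472 bytes. If a ping of that size fails between two nodes while a smaller one succeeds, something on the path has a smaller MTU than the hosts expect.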
One other thing I wanted to get clarity on was your rbd perf (dd) tests. I was under the impression that rbd devices are striped across all of the OSDs, whereas when writing via objects or files, the object gets written to a single disk. If my understanding is correct, a dd would yield significantly better results (throughput/IOPS) for an rbd than for a file or an object. Please let me know if I am missing something.
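To make the striping question concrete, here is a rough sketch of how a byte range on an RBD image maps onto backing RADOS objects. It assumes the default 4 MiB object size and no custom striping parameters. It does not model CRUSH, so it says nothing about which OSD each object lands on, only how many distinct objects a write touches. The function names are illustrative and not part of any Ceph API.

```python
# Assumption: RBD default object size of 4 MiB (object order 22).
DEFAULT_OBJECT_BYTES = 4 * 1024 * 1024


def object_index(offset, object_bytes=DEFAULT_OBJECT_BYTES):
    """Index of the backing object that holds the given image byte offset."""
    return offset // object_bytes


def objects_touched(offset, length, object_bytes=DEFAULT_OBJECT_BYTES):
    """Number of distinct backing objects covered by a write of `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first = object_index(offset, object_bytes)
    last = object_index(offset + length - 1, object_bytes)
    return last - first + 1


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # A sequential 1 GiB dd starting at offset 0 of the image.
    print("objects touched by a 1 GiB write:", objects_touched(0, one_gib))
    # A single 4 KiB write stays inside one object.
    print("objects touched by a 4 KiB write:", objects_touched(0, 4096))
```

Under these assumptions a large sequential dd spreads over many objects and can keep many OSDs busy, while each small write still lands in a single object.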
Thank you.
On Tue, Aug 4, 2015 at 2:53 PM, Shane Gibson <Shane_Gibson@xxxxxxxxxxxx> wrote:
Bob,

Those numbers would seem to indicate some other problem. One of the biggest culprits of that poor performance is often network issues. In the last few months, there have been several reported performance issues that have turned out to be network related. Not all, but most.

Your best bet is to check each host's interface statistics for errors. Make sure the MTU size matches (jumbo frame settings on the hosts and on your switches). Check your switches for network errors. Try extended-size ping checks between nodes: ensure you set the packet size close to your max MTU size, and check that you're getting good performance from *all nodes* to every other node. Last, try a network performance test to each of the OSD nodes and see if one of them is acting up.

If you are backing your journal on SSD, you DEFINITELY should be getting vastly better performance than that. I have a cluster with 6 OSD nodes w/ 10x 4TB OSDs, using 2 7200 rpm disks as the journal (12 disks total). NO SSDs in that configuration. I can push the cluster to about 650 MByte/sec via network RBD 'dd' tests, and get about 2500 IOPS. NOTE - this is an all spinning HDD cluster w/ 7200 rpm disks!

~~shane

On 8/4/15, 2:36 PM, "ceph-users on behalf of Bob Ababurko" <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of bob@xxxxxxxxxxxx> wrote:

I have my first ceph cluster up and running and am currently testing cephfs for file access.
It turns out I am not getting excellent write performance on my cluster via cephfs (kernel driver) and would like to explore moving my cephfs_metadata pool to SSD.

To quickly describe the cluster:

all nodes run CentOS 7.1 w/ ceph-0.94.1 (Hammer)

[bababurko@cephosd01 ~]$ uname -r
3.10.0-229.el7.x86_64
[bababurko@cephosd01 ~]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)

6 OSD nodes w/ 5 x 1TB (7200 rpm/don't have model handy) SATA & 1 TB SSD (850 pro) which includes a journal (5GB) for each of the 5 OSDs, so there is much space left on the SSD to create a partition for an SSD pool...at least 900GB per SSD. Also noteworthy is that these disks are behind a RAID controller (LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2) with each disk configured as RAID 0.
3 MON nodes
1 MDS node

My writes are not going as I would expect wrt IOPS (50-1000 IOPS) & write throughput (~25MB/s max). I'm interested in understanding what it takes to create an SSD pool that I can then migrate the current cephfs_metadata pool to. I suspect that the spinning disk metadata pool is a bottleneck, and I want to try to get the max performance out of this cluster to prove that we would build out a larger version. One caveat is that I have copied about 4 TB of data to the cluster via cephfs and don't want to lose the data, so I obviously need to keep the metadata intact.

If anyone has done this OR understands how this can be done, I would appreciate the advice.

thanks in advance,
Bob
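To put the two sets of figures in this thread side by side, the cluster-wide numbers can be divided by the OSD count. This is only back-of-the-envelope arithmetic on the values quoted above. It assumes load is spread evenly across OSDs, ignores replication and journal overhead, and compares RBD 'dd' results on one cluster with cephfs results on the other, so treat it as a rough scale check and not a benchmark. `per_osd` is a helper made up for this example.

```python
def per_osd(total, nodes, osds_per_node):
    """Divide a cluster-wide figure evenly across all OSDs (assumes even load)."""
    return total / (nodes * osds_per_node)


if __name__ == "__main__":
    # All-HDD cluster described in the reply: 6 nodes x 10 OSDs, ~650 MB/s, ~2500 IOPS.
    print("HDD cluster: %.1f MB/s per OSD" % per_osd(650, 6, 10))
    print("HDD cluster: %.1f IOPS per OSD" % per_osd(2500, 6, 10))
    # Cluster from the original post: 6 nodes x 5 OSDs, ~25 MB/s max, 50-1000 IOPS.
    print("This cluster: %.2f MB/s per OSD" % per_osd(25, 6, 5))
    print("This cluster: %.1f to %.1f IOPS per OSD"
          % (per_osd(50, 6, 5), per_osd(1000, 6, 5)))
```

On these numbers the all-HDD cluster averages roughly 10.8 MB/s per OSD, against under 1 MB/s per OSD here. That gap is consistent with the suggestion that something other than the disks is the limit.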
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com