On 20/12/2015 22:51, Don Waterloo wrote:

> All nodes have 10Gbps to each other

Even the link client node <---> cluster nodes?

> OSD:
> $ ceph osd tree
> ID WEIGHT  TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 5.48996 root default
> -2 0.89999     host nubo-1
>  0 0.89999         osd.0          up  1.00000          1.00000
> -3 0.89999     host nubo-2
>  1 0.89999         osd.1          up  1.00000          1.00000
> -4 0.89999     host nubo-3
>  2 0.89999         osd.2          up  1.00000          1.00000
> -5 0.92999     host nubo-19
>  3 0.92999         osd.3          up  1.00000          1.00000
> -6 0.92999     host nubo-20
>  4 0.92999         osd.4          up  1.00000          1.00000
> -7 0.92999     host nubo-21
>  5 0.92999         osd.5          up  1.00000          1.00000
>
> Each contains 1 x Samsung 850 Pro 1TB SSD (on sata)
>
> Each are Ubuntu 15.10 running 4.3.0-040300-generic kernel.
> Each are running ceph 0.94.5-0ubuntu0.15.10.1
>
> nubo-1/nubo-2/nubo-3 are 2x X5650 @ 2.67GHz w/ 96GB ram.
> nubo-19/nubo-20/nubo-21 are 2x E5-2699 v3 @ 2.30GHz, w/ 576GB ram.
>
> the connections are to the chipset sata in each case.
> The fio test to the underlying xfs disk
> (e.g. cd /var/lib/ceph/osd/ceph-1; fio --randrepeat=1 --ioengine=libaio
> --direct=1 --gtod_reduce=1 --name=readwrite --filename=rw.data --bs=4k
> --iodepth=64 --size=5000MB --readwrite=randrw --rwmixread=50)
> shows ~22K IOPS on each disk.
>
> nubo-1/2/3 are also the mon and the mds:
> $ ceph status
>     cluster b23abffc-71c4-4464-9449-3f2c9fbe1ded
>      health HEALTH_OK
>      monmap e1: 3 mons at {nubo-1=10.100.10.60:6789/0,nubo-2=10.100.10.61:6789/0,nubo-3=10.100.10.62:6789/0}
>             election epoch 1104, quorum 0,1,2 nubo-1,nubo-2,nubo-3
>      mdsmap e621: 1/1/1 up {0=nubo-3=up:active}, 2 up:standby
>      osdmap e2459: 6 osds: 6 up, 6 in
>       pgmap v127331: 840 pgs, 6 pools, 144 GB data, 107 kobjects
>             289 GB used, 5332 GB / 5622 GB avail
>                  840 active+clean
>   client io 0 B/s rd, 183 kB/s wr, 54 op/s

And you have "replica size == 3" in your cluster, correct?
Do you have specific mount options, or specific options in ceph.conf concerning ceph-fuse?

So overall the hardware configuration of your cluster seems clearly better than mine (config given in my first message): you have 10Gb links everywhere (between the client and the cluster I have only 1Gb) and your OSDs are all SSD.

I have tried to put _all_ of cephfs on my SSD, i.e. the pools "cephfsdata" _and_ "cephfsmetadata" are now on the SSD. Performance is slightly improved, I get ~670 IOPS now (with the fio command of my first message again), but that still seems poor to me.

In fact, I'm curious to hear the opinion of cephfs experts on what IOPS we can expect. If ~700 IOPS is actually a normal figure for our hardware configuration, then maybe we are looking for a problem that doesn't exist...

--
François Lafont
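
For reference, one way to answer the replica-size and ceph-fuse questions on the cluster side is something like the commands below ("cephfsdata" and "cephfsmetadata" are just the pool names used in this thread, and the [client] section is only an example of where ceph-fuse options would normally live):

$ ceph osd dump | grep "^pool"                      # shows "replicated size N" per pool
$ ceph osd pool get cephfsdata size
$ ceph osd pool get cephfsmetadata size
$ mount | grep ceph                                 # mount options actually in effect on the client
$ sed -n '/\[client\]/,/^\[/p' /etc/ceph/ceph.conf  # any client-side ceph-fuse options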
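And for what it's worth, pinning the cephfs pools to an SSD-only part of the CRUSH map is usually done along these lines. This is only a sketch: it assumes a CRUSH root named "ssd" that already contains the SSD OSDs, "ssd_rule" is an arbitrary name, <ruleset-id> must be replaced by the id reported by the dump, and on hammer (0.94.x) the pool option is still called crush_ruleset:

$ ceph osd crush rule create-simple ssd_rule ssd host
$ ceph osd crush rule dump ssd_rule                 # note the "ruleset" id
$ ceph osd pool set cephfsdata crush_ruleset <ruleset-id>
$ ceph osd pool set cephfsmetadata crush_ruleset <ruleset-id>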
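And just to be explicit about what is being measured on the cephfs side: running the same 4k randrw fio test as Don's command above, but from inside a ceph-fuse mount, would look roughly like this (the mount point /mnt/cephfs is an assumption):

$ cd /mnt/cephfs
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 \
      --name=readwrite --filename=rw.data --bs=4k --iodepth=64 \
      --size=5000MB --readwrite=randrw --rwmixread=50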