On 24-6-2017 05:30, Christian Wuerdig wrote:
> The general advice floating around is that you want CPUs with high
> clock speeds rather than more cores to reduce latency and increase
> IOPS for SSD setups (see also
> http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/),
> so something like an E5-2667V4 might bring better results in that
> situation.
> There was also some talk about disabling the processor C-states in
> order to bring latency down (something like this should be easy to
> test: https://stackoverflow.com/a/22482722/220986).

I would be very careful about calling this general advice...

Although the article is interesting, it is rather one-sided. The only
thing it shows is that there is a linear relation between clock speed
and write or read speed. The article is rather vague about how and
what is actually tested.

By running just a single OSD with no replication, a lot of the
functionality is left out of the equation. Nobody runs just one OSD on
a box in a normal cluster host. Not using a serious SSD is another
source of noise in the conclusion. A higher queue depth can/will
certainly have an impact on concurrency.

I would call this an observation, and nothing more.

--WjW

> On Sat, Jun 24, 2017 at 1:28 AM, Kostas Paraskevopoulos
> <reverend.x3@xxxxxxxxx> wrote:
>
> Hello,
>
> We are in the process of evaluating the performance of a testing
> cluster (3 nodes) with Ceph Jewel. Our setup consists of:
> 3 monitors (VMs)
> 2 physical servers, each connected to 1 JBOD, running Ubuntu Server
> 16.04
>
> Each server has 32 threads @ 2.1 GHz and 128 GB RAM.
> The disk distribution per server is:
> 38 * HUS726020ALS210 (SAS rotational)
> 2 * HUSMH8010BSS200 (SAS SSD for journals)
> 2 * ST1920FM0043 (SAS SSD for data)
> 1 * INTEL SSDPEDME012T4 (NVMe, measured with fio at ~300K IOPS)
>
> Since we don't currently have a 10 Gbit switch, we test the
> performance with the cluster in a degraded state, the noout flag set,
> and the RBD images mounted on the powered-on OSD node. We confirmed
> that the network is not saturated during the tests.
>
> We ran tests on the NVMe disk and on the pool created on this disk,
> where we hoped to get the most performance without being limited by
> the hardware specs, since we have more disks than CPU threads.
>
> The NVMe disk was at first set up with one data partition and the
> journal on the same disk. The performance on random 4K reads topped
> out at 50K IOPS. We then removed the OSD and repartitioned the disk
> with 4 data partitions and 4 journals. The performance didn't
> increase significantly. Also, since we are running read tests, the
> journals shouldn't cause performance issues.
>
> We then ran 4 fio processes in parallel against the same mounted RBD
> image and the total reached 100K IOPS. More parallel fio processes
> didn't increase the measured IOPS.
>
> Our ceph.conf is pretty basic (debug is set to 0/0 for everything)
> and the crushmap just defines the different buckets/rules for the
> disk separation (rotational, SSD, NVMe) in order to create the
> required pools.
>
> Is a performance of 100,000 IOPS for random 4K reads normal for a
> disk that reaches more than 300K IOPS in the same benchmark on the
> same hardware, or are we missing something?
> Best regards,
> Kostas

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
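
For the C-state test Christian mentions: a minimal sketch, assuming the
linked Stack Overflow answer relies on the usual /dev/cpu_dma_latency
(PM QoS) interface. While a process holds that file open with a latency
target of 0, the kernel should keep the CPUs out of deep C-states;
closing the file restores the defaults. It needs root and a kernel with
PM QoS support.

/* Sketch: hold CPUs out of deep C-states via /dev/cpu_dma_latency.
 * The constraint only lasts as long as the file descriptor stays open.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int32_t target_us = 0;   /* 0 us latency target = no deep C-states */
    int fd = open("/dev/cpu_dma_latency", O_WRONLY);
    if (fd < 0) {
        perror("open /dev/cpu_dma_latency");
        return 1;
    }
    if (write(fd, &target_us, sizeof(target_us)) !=
        (ssize_t)sizeof(target_us)) {
        perror("write");
        close(fd);
        return 1;
    }
    printf("C-state latency target set to %d us; press Enter to release.\n",
           target_us);
    getchar();               /* run the benchmark while this is held */
    close(fd);               /* constraint is dropped here */
    return 0;
}

Running the same fio test once with this held open and once without
should show fairly quickly how much the C-states contribute to latency
on a given box.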
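
The four parallel fio processes from the original post can also be
expressed as a single job file, which makes the queue depth explicit. A
rough sketch, assuming a kernel-mapped image at /dev/rbd0; the device
path, iodepth and numjobs values are placeholders to adjust:

; Sketch: 4K random reads against a mapped RBD image, 4 jobs in parallel
[global]
ioengine=libaio
direct=1
rw=randread
bs=4k
runtime=120
time_based=1
group_reporting=1

[rbd-randread]
filename=/dev/rbd0
iodepth=32
numjobs=4

fio also ships an rbd ioengine that talks to librbd directly, which
takes the krbd layer out of the picture and can be a useful cross-check
against the numbers measured on the mapped device.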
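
On the crushmap side: Jewel has no device classes yet (those arrived in
Luminous), so the usual way to separate rotational, SSD and NVMe disks
is one root per device type plus a matching rule, roughly as sketched
below. The bucket ID, OSD numbers, weights and ruleset number are made
up for illustration:

# Sketch: dedicated root and rule for the NVMe OSDs
root nvme {
        id -20
        alg straw
        hash 0  # rjenkins1
        item osd.40 weight 1.000
        item osd.41 weight 1.000
}

rule nvme_rule {
        ruleset 3
        type replicated
        min_size 1
        max_size 10
        step take nvme
        step choose firstn 0 type osd
        step emit
}

A pool is then pointed at that rule with something like
"ceph osd pool set <pool> crush_ruleset 3".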