I'm wondering if you are hitting the "bug" with the readahead changes? I
know the change to limit readahead to 2MB was introduced in 3.15, but I
don't know whether it was backported into 3.13 or not. I have a feeling
this may also limit the maximum request size to 2MB. If you look in
iostat, do you see different request sizes between the two kernels?

There is a 4.2 kernel with the readahead change reverted; it might be
worth testing:
http://gitbuilder.ceph.com/kernel-deb-precise-x86_64-basic/ref/ra-bring-back/

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> MailingLists - EWS
> Sent: 06 October 2015 18:12
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Poor Read Performance with Ubuntu 14.04 LTS 3.19.0-30 Kernel
>
> > Hi,
> >
> > Very interesting! Did you upgrade the kernel on both the OSDs and
> > clients or just some of them? I remember there were some kernel
> > performance regressions a little while back. You might try running
> > perf during your tests and look for differences. Also, iperf might be
> > worth trying to see if it's a network regression.
> >
> > I also have a script that compares output from sysctl which might be
> > worth trying to see if any defaults changed.
> >
> > https://github.com/ceph/cbt/blob/master/tools/compare_sysctl.py
> >
> > Basically, just save sysctl -a with both kernels and pass them as
> > arguments to the python script.
> >
> > Mark
>
> Mark,
>
> The testing was done with 3.19 on the client and 3.13 on the OSD nodes,
> using "rados bench -p bench 50 seq" after an initial "rados bench -p
> bench 50 write --no-cleanup". We suspected the network as well and
> tested with iperf as one of our first steps, and saw the expected speeds
> (9.9Gb/s, as we are using bonded X540-T2 interfaces) on both kernels. As
> an added data point, we have no problem with write performance to the
> same pool with the same kernel configuration (~1GB/s). We also checked
> the read_ahead_kb values of the block devices, but both were at the
> default of 128 (we have since changed these to 4096 in our
> configuration, but the results were seen with the default of 128).
>
> We are in the process of rebuilding the entire cluster on 3.13 with a
> completely fresh installation of Ceph to make sure nothing else is at
> play here.
>
> We did check a few things in iostat and collectl, but we didn't see any
> read IO against the OSDs, so I am leaning towards something further up
> the stack.
>
> Just a little more background on the cluster configuration:
>
> A specific pool created just for benchmarking, with 512 pgs and pgps and
> 2 replicas. Using 3 OSD nodes (also handling MON duties), each with 8
> SATA 7.2K RPM OSDs and 2 NVMe journals (a 4 OSD to 1 journal ratio),
> 1 x hex-core CPU and 32GB of RAM per OSD node.
>
> Tom
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
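
For reference, a minimal sketch of the readahead/request-size checks
suggested above. The device name sda is only an illustration; substitute
the actual OSD data disks (or the RBD device, if krbd is in use):

    # Current readahead of the block device in KB (128 is the usual default)
    cat /sys/block/sda/queue/read_ahead_kb

    # Same value via blockdev, reported in 512-byte sectors (256 == 128KB)
    blockdev --getra /dev/sda

    # Raise readahead to 4096KB, as mentioned in the thread
    echo 4096 > /sys/block/sda/queue/read_ahead_kb

    # Watch average request sizes while the seq bench runs; compare the
    # avgrq-sz column (in sectors; called areq-sz and reported in KB on
    # newer sysstat) between the 3.13 and 3.19 runs
    iostat -x sda 1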
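
Likewise, a sketch of the sysctl comparison Mark describes; the output
filenames are just examples:

    # Capture the full sysctl state while booted into each kernel in turn
    sysctl -a > sysctl-3.13.txt
    sysctl -a > sysctl-3.19.txt

    # Diff the two dumps with the cbt helper script linked above
    python compare_sysctl.py sysctl-3.13.txt sysctl-3.19.txt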