I performed this kernel upgrade (to 3.19.0-30) over the weekend on my cluster, and my before/after benchmarks were very close to each other, at about 500MB/s each.
On Tue, Oct 6, 2015 at 3:15 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
I'm wondering if you are hitting the "bug" with the readahead changes? I
know the change to limit readahead to 2MB was introduced in 3.15, but I
don't know if it was backported into 3.13 or not. I have a feeling this may
also limit the maximum request size to 2MB.

If you look in iostat, do you see different request sizes between the two
kernels?
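
For example, something along these lines on the OSD nodes while the
benchmark is running (the device names below are just placeholders):

  iostat -x 5 /dev/sdb /dev/sdc

avgrq-sz there is the average request size in 512-byte sectors, so roughly
4096 would mean 2MB requests; compare that figure between the 3.13 and
3.19 runs.
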
There is a 4.2 kernel build with the readahead change reverted; it might be
worth testing:

http://gitbuilder.ceph.com/kernel-deb-precise-x86_64-basic/ref/ra-bring-back/
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of MailingLists - EWS
> Sent: 06 October 2015 18:12
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Poor Read Performance with Ubuntu 14.04 LTS 3.19.0-30 Kernel
>
> > Hi,
> >
> > Very interesting! Did you upgrade the kernel on both the OSDs and
> > clients or just some of them? I remember there were some kernel
> > performance regressions a little while back. You might try running perf
> > during your tests and look for differences. Also, iperf might be worth
> > trying to see if it's a network regression.
> >
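> > For example (the host name below is just a placeholder):
> >
> >   # quick network check between the client and an OSD node
> >   iperf -s                    # on the OSD node
> >   iperf -c osd-node-1         # on the client
> >
> >   # system-wide profile of the client while the benchmark runs
> >   perf record -a -g -- sleep 30
> >   perf report
> >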
> > I also have a script that compares output from sysctl which might be
> > worth trying to see if any defaults changed.
> >
> > https://github.com/ceph/cbt/blob/master/tools/compare_sysctl.py
> >
> > Basically, just save "sysctl -a" output with both kernels and pass the
> > two files as arguments to the python script.
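> >
> > For example (the file names are just placeholders):
> >
> >   sysctl -a > sysctl-3.13.txt     # on the old kernel
> >   sysctl -a > sysctl-3.19.txt     # after the upgrade
> >   python compare_sysctl.py sysctl-3.13.txt sysctl-3.19.txt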
> >
> > Mark
>
> Mark,
>
> The testing was done with 3.19 on the client and 3.13 on the OSD nodes,
> using "rados bench -p bench 50 seq" after an initial "rados bench -p bench
> 50 write --no-cleanup". We suspected the network as well and tested with
> iperf as one of our first steps, but saw the expected speeds (9.9Gb/s, as
> we are using bonded X540-T2 interfaces) on both kernels. As an added data
> point, we have no problem with write performance to the same pool with the
> same kernel configuration (~1GB/s). We also checked the read_ahead_kb
> values of the block devices, but both were at the default of 128 (we have
> since changed these to 4096 in our configuration, but the results were
> seen with the default of 128).
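>
> For reference, read_ahead_kb can be checked and set per device along these
> lines ("sdb" below is just an example device name):
>
>   cat /sys/block/sdb/queue/read_ahead_kb
>   echo 4096 > /sys/block/sdb/queue/read_ahead_kb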
>
> We are in the process of rebuilding the entire cluster to use 3.13 and a
> completely fresh installation of Ceph to make sure nothing else is at play
> here.
>
> We did check a few things in iostat and collectl, but we didn't see any
> read IO against the OSDs, so I am leaning towards something further up the
> stack.
>
> Just a little more background on the cluster configuration:
>
> Specific pool created just for benchmarking, using 512 PGs/PGPs and 2
> replicas. Using 3 OSD nodes (also handling MON duties) with 8 SATA 7.2K
> RPM OSDs and 2 NVMe journals (a 4:1 OSD-to-journal ratio). 1 x hex-core
> CPU with 32GB of RAM per OSD node.
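>
> For reference, that pool setup corresponds to something like the following
> (the exact commands may have differed):
>
>   ceph osd pool create bench 512 512
>   ceph osd pool set bench size 2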
>
> Tom
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com