Re: Poor Read Performance with Ubuntu 14.04 LTS 3.19.0-30 Kernel


 



I'm wondering if you are hitting the "bug" with the readahead changes?

I know the change to limit readahead to 2MB was introduced in 3.15, but I
don't know whether it was backported into 3.13 or not. I have a feeling it
may also limit the maximum request size to 2MB.

If you look in iostat, do you see different request sizes between the two
kernels?
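
Something like this on the OSD nodes while the seq test is running should
show it (the avgrq-sz column is the average request size in 512-byte
sectors):

  iostat -x 1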

There is a 4.2 kernel with the readahead change reverted; it might be worth
testing it:

http://gitbuilder.ceph.com/kernel-deb-precise-x86_64-basic/ref/ra-bring-back/
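
It might also be worth comparing the block queue limits under both kernels,
just to rule out a straight request size cap (sda is only an example device,
check whichever disks back the OSDs):

  cat /sys/block/sda/queue/read_ahead_kb
  cat /sys/block/sda/queue/max_sectors_kb
  cat /sys/block/sda/queue/max_hw_sectors_kb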



> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> MailingLists - EWS
> Sent: 06 October 2015 18:12
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Poor Read Performance with Ubuntu 14.04 LTS
> 3.19.0-30 Kernel
> 
> > Hi,
> >
> > Very interesting!  Did you upgrade the kernel on both the OSDs and
> > clients or just some of them?  I remember there were some kernel
> > performance regressions a little while back.  You might try running
> > perf during your tests and look for differences.  Also, iperf might be
> > worth trying to see if it's a network regression.
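> >
> > e.g. roughly:
> >
> >   perf top               # on the OSD nodes during the benchmark run
> >   iperf -s               # on one of the OSD nodes
> >   iperf -c <osd host>    # from the client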
> >
> > I also have a script that compares output from sysctl, which might be
> > worth trying to see if any defaults changed.
> >
> > https://github.com/ceph/cbt/blob/master/tools/compare_sysctl.py
> >
> > basically just save sysctl -a with both kernels and pass them as
> > arguments to the python script.
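> >
> > e.g. something like:
> >
> >   sysctl -a > sysctl-3.13.txt     # while booted into 3.13
> >   sysctl -a > sysctl-3.19.txt     # while booted into 3.19
> >   python compare_sysctl.py sysctl-3.13.txt sysctl-3.19.txt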
> >
> > Mark
> 
> Mark,
> 
> The testing was done with 3.19 on the client and 3.13 on the OSD nodes,
> using "rados bench -p bench 50 seq" after an initial "rados bench -p bench
> 50 write --no-cleanup". We suspected the network as well and tested with
> iperf as one of our first steps, and saw the expected speeds (9.9Gb/s, as
> we are using bonded X540-T2 interfaces) on both kernels. As an added data
> point, we have no problem with write performance to the same pool with the
> same kernel configuration (~1GB/s). We also checked the values of
> read_ahead_kb on the block devices, but both were shown to be the default
> of 128 (we have since changed these to 4096 in our configuration, but the
> results were seen with the default of 128).
> 
> We are in the process of rebuilding the entire cluster to use 3.13 and a
> completely fresh installation of Ceph to make sure nothing else is at play
> here.
> 
> We did check a few things in iostat and collectl, but we didn't see any
> read IO against the OSDs, so I am leaning towards something further up
> the stack.
> 
> Just a little more background on the cluster configuration:
> 
> Specific pool created just for benchmarking, using 512 pgs and pgps and 2
> replicas. Using 3 OSD nodes (also handling MON duties) with 8 SATA 7.2K
> RPM OSDs and 2 NVMe journals (4 OSD to 1 journal ratio). 1 x hex-core CPU
> with 32GB of RAM per OSD node.
> 
> Tom
> 
> 




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


