> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> J David
> Sent: 23 April 2015 17:51
> To: Nick Fisk
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Having trouble getting good performance
>
> On Wed, Apr 22, 2015 at 4:30 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> > I suspect you are hitting problems with sync writes, which Ceph isn't
> > known for being the fastest thing for.
>
> There's "not being the fastest thing," and then there's "an expensive
> cluster of hardware that performs worse than a single SATA drive." :-(

If you are doing single-threaded writes (not saying you are, but...) you will
get worse performance than a SATA drive, as a write to each OSD actually
involves 2-3x the number of source IOs, which is why SSD journals help so
much. A single-threaded workload without SSD journals will probably max out
at around 40-50 IOPS; this then scales roughly linearly until you hit
iodepth = #disks/#replicas (so with, say, 12 OSDs and 3x replication,
somewhere around iodepth=4), at which point performance starts to tail off
as latency increases.

If you can let us know the average queue depth that ZFS is generating, that
will give a good estimate of what you can expect from the cluster.

I'm using Ceph with ESXi over iSCSI, which is a shocker for sync writes. A
single thread gets awful performance, but when hundreds of things start
happening at once, Ceph smiles and IOPS go into the thousands.

> > I'm not a big expert on ZFS, but I do know that an SSD ZIL is normally
> > recommended to allow fast sync writes.
>
> The VM already has a ZIL vdisk on an SSD on the KVM host.
>
> You may be thinking of NFS traffic, which is where ZFS and sync writes
> have such a long and difficult history. As far as I know, zfs receive
> operations do not do sync writes (except possibly at the beginning and end
> to establish barriers), since the operation as a whole already has
> fundamental transactionality built in.
>
> > SSD Ceph journals may give you around 200-300 iops
>
> SSD journals are not an option for this cluster. We just need to get the
> most out of what we have.
>
> There are some fio results showing this problem isn't limited/specific to
> ZFS, which I will post in a separate message shortly.

I have had a look through the fio runs; could you also try running a couple
of jobs with iodepth=64 instead of numjobs=64? I know they should do roughly
the same thing, but the numbers from the former are easier to interpret.

It may also be worth getting hold of the RBD-enabled fio so that you can
test performance outside of the VM (rough examples of both kinds of run are
sketched at the end of this message).

> Thanks!
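
To make the fio suggestion above concrete, here is a rough sketch of the two
kinds of run I have in mind: one inside the VM with iodepth=64, and one run
straight against RBD with fio's rbd engine (which needs a fio build with RBD
support). The test file path, pool name and image name below are only
placeholders; adjust them for your setup and point the jobs at something you
don't mind scribbling over.

# (1) Inside the VM: a single job at queue depth 64 (e.g. qd64.fio)
[global]
ioengine=libaio
# drop direct=1 if the filesystem (e.g. ZFS) does not support O_DIRECT
direct=1
bs=4k
rw=randwrite
runtime=60
time_based
group_reporting

[qd64]
# placeholder path: use a scratch file or device you can overwrite
filename=/path/to/testfile
size=4g
iodepth=64
numjobs=1

# (2) Outside the VM: fio's rbd engine against a throwaway test image
# (e.g. rbd-qd64.fio; pool/image/client names are placeholders)
[rbd-qd64]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio-test
invalidate=0
bs=4k
rw=randwrite
iodepth=64
runtime=60
time_based

Run each with "fio <jobfile>" and compare the randwrite IOPS; the gap between
the in-VM number and the rbd-engine number gives a rough idea of how much the
VM and filesystem layers are costing on top of raw RBD.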