Re: Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

Hello,

On Fri, 6 Jan 2017 08:40:36 +0530 kevin parrikar wrote:

> Hello All,
> 
> I have set up a Ceph cluster based on the 0.94.6 release on 2 servers, each with
> an 80 GB Intel S3510 and 2x 3 TB 7.2k SATA disks, 16 CPUs, 24 GB RAM,
> connected to a 10G switch with a replica count of 2 [I will add 3 more
> servers to the cluster], plus 3 separate monitor nodes which are VMs.
> 
> 
I'd go to the latest Hammer release; the version you're on has a lethal
cache-tier bug, should you decide to try that feature.

80 GB Intel DC S3510s are a) slow and b) rated for only 0.3 DWPD.
You're going to wear them out quickly, and if they're not replaced in time
you'll lose data.

2 HDDs give you a theoretical sustained speed of something like 300 MB/s;
when used as OSDs I'd expect the usual 50-60 MB/s per OSD due to
seeks, journal (file system) and leveldb overheads.
Which matches your results perfectly.
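As a sanity check, the observed numbers line up with a simple division (a
sketch; it assumes 2 OSDs per node and that with replica 2 across 2 nodes
each node receives the full client stream):

```shell
# With replica 2 across 2 nodes, each node writes a full copy, so
# per-node disk bandwidth ~= client bandwidth, split across that
# node's OSDs.
client_mb_s=110      # observed network rate on the compute node
osds_per_node=2      # the two 3 TB SATA spindles
per_osd=$(( client_mb_s / osds_per_node ))
echo "expected per-OSD write: ~${per_osd} MB/s"
```

That puts each spindle right in the 50-60 MB/s range above.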

> rbd_cache is enabled in the configuration, XFS filesystem, LSI 92465-4i RAID
> card with 512 MB cache [SSD is in writeback mode with BBU]
> 
> 
> Before installing Ceph, I tried to check the max throughput of the Intel 3500 80 GB
> SSD using a block size of 4M [I read somewhere that Ceph uses 4 MB objects], and
> it was giving 220 MB/s {dd if=/dev/zero of=/dev/sdb bs=4M count=1000
> oflag=direct}
> 
Irrelevant, sustained sequential writes will be limited by what your OSDs
(HDDs) can sustain.
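A large-block dd also says nothing about journal suitability: the journal
does small synchronous writes, and that is what you'd want to measure.
Something like the following fio invocation is the usual way to qualify a
journal SSD (a sketch; /dev/sdb and the runtime are placeholders, and it
writes destructively to the device):

```shell
# Measure O_SYNC small-block write performance, which is what a Ceph
# journal actually does. WARNING: writes directly to the raw device.
fio --filename=/dev/sdb --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=journal-test
```

A proper journal SSD sustains high sync 4k IOPS here; consumer drives
without power-loss protection often collapse to a tiny fraction of their
datasheet numbers.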

> *Observation:*
> Now the cluster is up and running, and from the VM I am trying to write a 4 GB
> file to its volume using dd if=/dev/zero of=/dev/sdb bs=4M count=1000
> oflag=direct. It takes around 39 seconds to write.
> 
> During this time the SSD journal was showing a disk write rate of 104 MB/s on
> both Ceph servers (dstat sdb), and the compute node showed a network transfer
> rate of ~110 MB/s on its 10G storage interface (dstat -nN eth2).
> 
> 
As I said, sounds about right.

> 
> my questions are:
> 
> 
>    - Is this the best throughput Ceph can offer, or can anything in my
>    environment be optimised to get more performance? [iperf shows a max
>    throughput of 9.8 Gbit/s]
>
Not your network.

Watch your nodes with atop and you will note that your HDDs are maxed out.
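iostat from the sysstat package shows the same thing per device (a sketch;
the device names are assumptions for your HDD OSDs):

```shell
# %util pinned near 100 on the HDDs while the journal SSD stays mostly
# idle confirms the spindles, not the SSD or network, are the bottleneck.
iostat -x 2 sdc sdd
```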
 
> 
> 
>    - I guess the network/SSD is underutilised and can handle more writes;
>    how can this be improved to send more data over the network to the SSD?
> 
As jiajia wrote, a cache tier might give you some speed boosts.
But with those SSDs I'd advise against it: they are both too small and too
low in endurance.

> 
> 
>    - The rbd kernel module wasn't loaded on the compute node; I loaded it
>    manually using "modprobe" and later destroyed/re-created the VMs, but this
>    does not give any performance boost. So librbd and kernel RBD are equally fast?
> 
Irrelevant and confusing.
Your VMs will use one or the other depending on how they are configured.
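You can tell which path a given VM uses without guessing (a sketch; the
domain name is a placeholder):

```shell
# librbd path: QEMU talks to the cluster in userspace; the libvirt disk
# definition references the rbd protocol and no /dev/rbd* device exists.
virsh dumpxml myvm | grep -A2 "protocol='rbd'"

# kernel rbd path: the image was mapped to a host block device first,
# and the VM is handed that device.
rbd showmapped    # lists images mapped via the krbd module
```

Loading the module does nothing by itself; only the VM's disk configuration
decides the path.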

> 
> 
>    - A Samsung 840 EVO 512 GB shows a throughput of 500 MB/s for 4M writes [dd
>    if=/dev/zero of=/dev/sdb bs=4M count=1000 oflag=direct], and for 4 KB it was
>    as fast as the Intel S3500 80 GB. Does changing my SSD from the Intel
>    S3500 to the Samsung 840 make any performance difference here, just
>    because the 840 EVO is faster for 4M writes? Can Ceph utilise this
>    extra speed?
> 
Those SSDs would be an even worse choice for endurance/reliability
reasons, though their larger size offsets that a bit.

Unless you have a VERY good understanding and data on how much your
cluster is going to write, pick at the very least SSDs with 3+ DWPD
endurance like the DC S3610s.
In very lightly loaded cases a DC S3520 with 1 DWPD may be OK, but again,
you need to know what you're doing here.
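To put the endurance numbers in perspective (a back-of-envelope sketch;
it assumes the 0.3 DWPD rating and that the journal absorbs the full
~104 MB/s stream you measured):

```shell
# How long can the journal sustain the observed write rate before
# exceeding its rated daily write volume?
size_gb=80        # Intel DC S3510 capacity
dwpd=0.3          # rated drive writes per day
mb_per_s=104      # observed journal write rate
awk -v s="$size_gb" -v d="$dwpd" -v w="$mb_per_s" 'BEGIN {
    rated = s * d                # GB/day the drive is rated for
    secs  = rated * 1024 / w     # seconds of sustained writing to reach it
    printf "rated: %.0f GB/day, used up after ~%.0f s of writes\n", rated, secs
}'
```

That is roughly four minutes of your dd workload per day to exhaust the
rated write budget.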

Christian
> 
> Can somebody help me understand this better.
> 
> Regards,
> Kevin


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


