On Fri, Feb 24, 2012 at 00:58, madhusudhana
<madhusudhana.u.acharya@xxxxxxxxx> wrote:
> 1. In my cluster, all OSD's are mkfs'ed with btrfs
> 2. Below is what i can see with ceph -s output. Is that mean, only one MDS
> is operation and another one is standby ?
> mds e5: 1/1/1 up {0=ceph-node-1=up:active}, 1 up:standby

Yes, you have 1 active and 1 standby MDS.

> 3. I will not be able to use new stable kernel bcz of company policy :-(

That might become an issue.

> 4. If you don't mind, can you please give me a bit of insight on cluster
> network, what it is and how i can configure one for my ceph cluster ?
> Will there be a significant performance improvement with this ?

When a client submits a write to Ceph, it needs to be replicated
(usually to two replicas). If all the Ceph servers have two network
interfaces and you connect them to two separate networks, you can
make the replication traffic go over the second interface, leaving
more bandwidth available between the cluster and the clients on the
first one (see the rough ceph.conf sketch at the end of this mail).

Or you could just bond two 1 gig links, or you could buy 10gig gear.

> 5. I have done some testing with dd on ceph. Below are the results
>
> CASE 1:[root@ceph-node-9 ~]# dd if=/dev/zero of=/mnt/ceph-test/wtest bs=4k
> count=1000000
...
> As you can see from above output, for 4G file of 4k blocks, speed clocked at
> 1GB/s, it gradually decreased when i increased the file size above 10G.
> And also, if i run back to back dd with CASE 1 option, the write will
> slow down from 1GB/s to 90MB/s.
>
> can you please explain whether this behaviour is expected ? if yes, why ?
> if not, how i can achieve 1GB/s for all file sizes ?

dd is not a very good benchmark. The 4GB write is small enough to
typically be (mostly) stored in RAM even on your client machine --
your benchmark might not even be sending the data over the network
yet!

You can get somewhat more realistic numbers by adding conv=fsync to
your dd command lines; that makes sure the data is written to disk
before dd reports completion.

In general, you should look for a benchmark that is closer to your
workload.
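For example, your CASE 1 run with the final flush forced (same path
and sizes as in your mail) would be something like:

  [root@ceph-node-9 ~]# dd if=/dev/zero of=/mnt/ceph-test/wtest bs=4k count=1000000 conv=fsync

That way dd does not report completion until the data has actually
been flushed out, so the result reflects more than just the page
cache on the client.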
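Coming back to the cluster network question (4): roughly, you tell
Ceph which subnet is client-facing and which one carries the
replication traffic. A minimal ceph.conf sketch could look like the
following -- the subnets here are made up, and the exact option names
can vary between Ceph versions, so check the documentation for the
release you are running:

  [global]
          ; client <-> cluster traffic on the first interface
          public network = 10.0.1.0/24
          ; OSD replication traffic on the second interface
          cluster network = 10.0.2.0/24

With that in place the OSDs put their inter-OSD traffic on an address
in the cluster network, while clients keep talking to the public
addresses.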