Thanks, Christian. I will reflect on what you told me. There is no free
lunch; I'll consider whether it's worth paying the price.

--
Thiago Henrique

On 24-05-2014 02:43, Christian Balzer wrote:
>
> Hello,
>
> On Fri, 23 May 2014 15:41:23 -0300 Listas at Adminlinux wrote:
>
>> Hi!
>>
>> I have failover clusters for some applications, generally with 2
>> members configured with Ubuntu + DRBD + Ext4. For example, my IMAP
>> cluster works fine with ~50k email accounts and my HTTP cluster hosts
>> ~2k sites.
>>
> My mailbox servers are also multiple DRBD-based cluster pairs.
> For performance in fully redundant storage there isn't anything better
> (in the OSS, generic-hardware space at least).
>
>> See the design here: http://adminlinux.com.br/cluster_design.txt
>>
>> I would like to provide load balancing instead of just failover, so I
>> would like to use a distributed filesystem architecture. As we know,
>> Ext4 isn't a distributed filesystem, so I wish to use Ceph in my
>> clusters.
>>
> You will find that all cluster/distributed filesystems have severe
> performance shortcomings compared to something like Ext4.
>
> On top of that, CephFS isn't ready for production, as the MDS isn't HA.
>
> A potential middle way might be to use Ceph/RBD volumes formatted with
> Ext4. That doesn't give you shared access, but it will allow you to
> separate storage and compute nodes, so when one compute node becomes
> busy, you can mount that volume from a more powerful compute node
> instead.
>
> All that said, I can't see any way or reason to replace my mailbox DRBD
> clusters with Ceph in the foreseeable future.
> To get performance and reliability similar to DRBD, I would have to
> spend 3-4 times the money.
>
> Where Ceph/RBD works well is in situations where you can't fit the
> compute needs into a storage node (as DRBD requires) and where you want
> to access things from multiple compute nodes, primarily for migration
> purposes.
> In short, as shared storage for VMs.
>
>> Any suggestions for the design of a cluster with Ubuntu + Ceph?
>>
>> I built a simple cluster of 2 servers to test simultaneous reading and
>> writing with Ceph. My conf: http://adminlinux.com.br/ceph_conf.txt
>>
> Again, CephFS isn't ready for production, but other than that I know
> very little about it, as I don't use it.
> However, your version of Ceph is severely outdated; you really should
> be looking at something more recent to rule out that you're hitting
> long-fixed bugs. The same goes for your entire setup and kernel.
>
> Also, Ceph only starts to perform decently with many OSDs (disks) and
> with the journals on SSDs instead of on the same disks.
> Think of DRBD's AL with internal metadata, but with MUCH more impact.
>
> Regards,
>
> Christian
>
>> But in my simultaneous benchmarks I found errors in reading and
>> writing. I ran "iozone -t 5 -r 4k -s 2m" simultaneously on both
>> servers in the cluster. The performance was poor and there were errors
>> like this:
>>
>> Error in file: Found '0' Expecting '6d6d6d6d6d6d6d6d' addr b6600000
>> Error in file: Position 1060864
>> Record # 259 Record size 4 kb
>> where b6600000 loop 0
>>
>> Performance graphs of the benchmark:
>> http://adminlinux.com.br/ceph_bench.html
>>
>> Can you help me find what I did wrong?
>>
>> Thanks!
>>
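
For reference, a two-node failover pair like the Ubuntu + DRBD + Ext4
setup described at the top of the thread is defined by a DRBD resource
file. A minimal sketch in DRBD 8.x syntax follows; the resource name,
backing disk, hostnames, and addresses are hypothetical, not taken from
the clusters discussed above.

    resource r0 {
        protocol C;                # synchronous replication; both nodes
                                   # acknowledge every write
        device    /dev/drbd0;      # replicated block device Ext4 sits on
        disk      /dev/sdb1;       # local backing disk on each node
        meta-disk internal;        # DRBD metadata on the backing disk itself
        on node-a {                # must match `uname -n` on the first node
            address 192.168.10.1:7788;
        }
        on node-b {
            address 192.168.10.2:7788;
        }
    }

Failover then amounts to promoting the surviving node with
"drbdadm primary r0" and mounting the Ext4 filesystem there, usually
driven by Pacemaker or Heartbeat rather than by hand.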
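
Christian's suggested middle way, an RBD volume carrying a plain Ext4
filesystem, could be sketched as follows with the standard rbd tooling
of that era; the pool name "mailstore" and image name "imap01" are made
up for illustration.

    # Create a 100 GB image (rbd sizes were given in MB in this version).
    rbd create mailstore/imap01 --size 102400

    # Map it on the current compute node, format once, and mount.
    rbd map mailstore/imap01       # shows up as e.g. /dev/rbd0
    mkfs.ext4 /dev/rbd0
    mount /dev/rbd0 /srv/imap

    # To shift the workload to a more powerful compute node:
    # unmount and unmap here, then map and mount the same image there.
    umount /srv/imap
    rbd unmap /dev/rbd0

Because Ext4 is not cluster-aware, the image must only ever be mounted
on one node at a time; this gives mobility, not shared access.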
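
The journal placement Christian recommends was, in the FileStore-based
Ceph releases of that era, a per-OSD setting in ceph.conf. A sketch
with illustrative host names and device paths:

    [osd]
    # Journal size in MB; an undersized journal throttles writes.
    osd journal size = 10240

    [osd.0]
    host = storage01
    # Data stays on the spinning disk...
    osd data = /var/lib/ceph/osd/ceph-0
    # ...while the journal points at a dedicated SSD partition.
    osd journal = /dev/sdg1

With the journal co-located on the data disk, every write lands on the
same spindle twice (journal first, then data), which is one reason the
small-record iozone run above performed so poorly.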