Re: FW: Ceph data locality

On 07/07/15 18:20, Dmitry Meytin wrote:
> Exactly because of that issue I've reduced the number of Ceph replicas to 2, and the number of HDFS copies is also 2 (so we're talking about 4 copies).
> I want (but haven't tried yet) to change Ceph replication to 1 and change HDFS back to 3.

You are stacking one distributed storage system on top of another, so it's
no wonder the performance is below your expectations.

You could (should?) use CephFS instead of HDFS on RBD-backed VMs, as that
stacking is clearly redundant and inefficient. Note that if you instead set
size=1 on your RBD pool (which will probably still be slower than running
Hadoop on CephFS) and then lose a single disk, you will probably freeze most
or all of your VMs (their disks are striped across all the physical disks of
your Ceph cluster) and almost certainly corrupt their filesystems.
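
For reference, replication is set per pool with the ceph CLI; a quick sketch
(the pool name "rbd" is just an assumption for your RBD pool, adjust to yours):

    # check the current replica count of the pool
    ceph osd pool get rbd size
    # keep at least 2 replicas -- size=1 leaves no redundancy at all
    ceph osd pool set rbd size 2
    # min_size controls how many replicas must be up for IO to continue
    ceph osd pool set rbd min_size 1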

See http://ceph.com/docs/master/cephfs/hadoop/
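
Roughly, the Hadoop side comes down to pointing core-site.xml at CephFS
instead of HDFS. A minimal sketch (property names are from memory of that
page and the monitor address is a placeholder, so verify against the doc
above):

    <!-- core-site.xml: use CephFS as the default Hadoop filesystem -->
    <property>
      <name>fs.default.name</name>
      <value>ceph://192.168.0.1:6789/</value> <!-- a Ceph monitor address -->
    </property>
    <property>
      <name>fs.ceph.impl</name>
      <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
    </property>
    <property>
      <name>ceph.conf.file</name>
      <value>/etc/ceph/ceph.conf</value>
    </property>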

If this doesn't work for you, I'd suggest separating the VMs' system disks
from the Hadoop storage and running the Hadoop storage nodes on bare metal.
The VMs can be backed either by local disks or by RBD if you need it, but in
any case they should avoid generating large IO spikes that could disturb the
Hadoop storage nodes.
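
One way to keep the VMs from generating such spikes is to throttle their
disks at the hypervisor; for example, with libvirt/KVM you can cap IOPS and
bandwidth per disk (the numbers below are made up, tune them to your
hardware):

    <!-- libvirt domain XML, inside the VM's <disk> element -->
    <iotune>
      <total_iops_sec>500</total_iops_sec>        <!-- cap IOPS -->
      <total_bytes_sec>52428800</total_bytes_sec> <!-- cap at ~50 MB/s -->
    </iotune>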

Lionel


