Re: FW: Ceph data locality

Exactly because of that issue I've reduced Ceph replication to 2, and the number of HDFS copies is also 2 (so we're talking about 4 copies).
I want (but haven't tried yet) to change Ceph replication to 1 and change HDFS replication back to 3.
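Roughly what I have in mind is something like the following (the pool name "rbd" and the HDFS path below are placeholders, not necessarily what we actually use):

  # Drop the Ceph pool to a single copy (no redundancy at the RADOS level)
  ceph osd pool set rbd size 1
  ceph osd pool set rbd min_size 1

  # Raise HDFS replication back to 3 for existing data (-w waits for completion);
  # new files would also need dfs.replication=3 in hdfs-site.xml
  hdfs dfs -setrep -w 3 /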


-----Original Message-----
From: Lionel Bouton [mailto:lionel+ceph@xxxxxxxxxxx] 
Sent: Tuesday, July 07, 2015 7:11 PM
To: Dmitry Meytin
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re:  FW: Ceph data locality

On 07/07/15 17:41, Dmitry Meytin wrote:
> Hi Lionel,
> Thanks for the answer.
> The missing info:
> 1) Ceph 0.80.9 "Firefly"
> 2) map-reduce makes sequential reads of blocks of 64MB (or 128 MB)
> 3) HDFS, which is running on top of Ceph, replicates the data 3 times
> between VMs, which could be located on the same physical host or on
> different hosts

HDFS on top of Ceph? How does that work exactly? If you run VMs backed by RBD which are then used by Hadoop to build HDFS, that means HDFS makes 3 copies, and with the default Ceph pool size=3 this would make 9 copies of the same data. If I understand this right, this is very inefficient.
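To spell the arithmetic out: effective copies = HDFS replication factor x Ceph pool size, so 3 x 3 = 9, or 2 x 2 = 4 with your current setup. You can check a pool's current replication factor with something like (pool name "rbd" is just an example):

  ceph osd pool get rbd size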

Lionel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


