Re: Cephfs Hadoop Plugin and CEPH integration

Why not use Swift? The integration has been around for a while and may be a better fit.

https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html
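For reference, a rough sketch of what the Hadoop side of that setup might look like, assuming the Ceph object gateway exposes its Swift-compatible API behind a Keystone-style auth endpoint. The property names follow the hadoop-openstack documentation linked above; the service name "cephrgw", the auth URL, the credentials and the container name are placeholders I made up, and the helper functions are only illustrative.

```python
import xml.etree.ElementTree as ET


def swift_properties(service, auth_url, username, password, tenant):
    """Build the hadoop-openstack properties for one Swift service definition.

    `service` is an arbitrary label that later appears in swift:// URIs.
    """
    prefix = f"fs.swift.service.{service}"
    return {
        "fs.swift.impl": "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
        f"{prefix}.auth.url": auth_url,
        f"{prefix}.username": username,
        f"{prefix}.password": password,
        f"{prefix}.tenant": tenant,
        # Use the public endpoint of the object store (assumption for this sketch).
        f"{prefix}.public": "true",
    }


def to_core_site_xml(props):
    """Render a property dict as a Hadoop <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def swift_uri(container, service, path):
    """Compose a swift://container.service/path URI as used by Hadoop jobs."""
    return f"swift://{container}.{service}/{path.lstrip('/')}"


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    props = swift_properties(
        service="cephrgw",
        auth_url="http://keystone.example.com:5000/v2.0/tokens",
        username="hadoop",
        password="secret",
        tenant="lab",
    )
    print(to_core_site_xml(props))
    print(swift_uri("bigfiles", "cephrgw", "/raw/sample.bin"))
```

The generated XML would go into core-site.xml, and jobs would then address data with URIs of the form swift://container.service/path. Note that this reads objects through the gateway over the network, not the existing CephFS tree directly.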

On Mon, Nov 27, 2017 at 12:55 PM, Aristeu Gil Alves Jr <aristeu.jr@xxxxxxxxx> wrote:
Hi.

This is my first post to the list. First of all, I have to say I'm new to Hadoop.

We are a small lab and have been running CephFS for almost two years, loading it with large files (4 GB to 4 TB in size). Our cluster holds approximately 400 TB at ~75% usage, and we are planning to grow a lot.

Until now, we have processed most of the files the "serial reading" way. Now we want to process these files in parallel, and we are looking at the Hadoop plugin as a way to use MapReduce or something similar.

Does the Hadoop plugin access CephFS over the network, as a normal cluster would, or can I install the Hadoop processing daemons on every Ceph node and process the data locally?


Thanks and regards,

--

Aristeu

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

