Why use Hadoop with Ceph?

On Friday, May 30, 2014, Ignazio Cassano <ignaziocassano at gmail.com> wrote:

> Hi all,
> I am testing Ceph because I find it very interesting as a remote
> block device.
> But my company is very interested in big data,
> so I read something about Hadoop and Ceph integration.
> Can anyone suggest some documentation explaining the purpose of
> Ceph/Hadoop integration?
> Why not use only Hadoop for big data?
>

It has a few advantages right now:
1) if you're already running Ceph, you only need to manage one storage
cluster
2) you get all of Ceph's reliability, resiliency, and dynamism
3) you get a real POSIX filesystem that you can run Hadoop workloads
against (which also lets you point other data analytics systems at the
same data)
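
To illustrate point 3: once CephFS is mounted, it looks like an ordinary
directory tree, so tools that know nothing about Hadoop can read the same
files with standard file calls. A minimal Python sketch, assuming a
hypothetical mount point /mnt/cephfs/logs (the path and the function name
count_words are illustrative, not from this thread):

```python
import os
from collections import Counter


def count_words(root):
    """Count whitespace-separated words in every file under root.

    Uses only ordinary POSIX-style file access (os.walk, open), so it
    works on any mounted filesystem, including a mounted CephFS.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going past non-UTF-8 bytes.
            with open(path, "r", encoding="utf-8", errors="replace") as f:
                for line in f:
                    counts.update(line.split())
    return counts


if __name__ == "__main__":
    # Hypothetical mount point; adjust to wherever CephFS is mounted.
    root = "/mnt/cephfs/logs"
    if os.path.isdir(root):
        for word, n in count_words(root).most_common(10):
            print(word, n)
```

Nothing here is Ceph-specific, which is the point: the same files a Hadoop
job reads are also reachable by any program that can open a path.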

In the future, when CephFS is more fully supported for production use,
you'll also be able to do things like use Ceph as the canonical location of
all your data, and run Hadoop loads against it without having to do an
export/import, etc.
-Greg


-- 
Software Engineer #42 @ http://inktank.com | http://ceph.com