Re: backing Hadoop with Ceph ??

On Wed, Jul 15, 2015 at 10:50 PM, John Spray <john.spray@xxxxxxxxxx> wrote:
>
>
> On 15/07/15 16:57, Shane Gibson wrote:
>>
>>
>>
>> We are in the (very) early stages of considering testing backing Hadoop
>> via Ceph - as opposed to HDFS.  I've seen a few very vague references to
>> doing that, but haven't found any concrete info (architecture, configuration
>> recommendations, gotchas, lessons learned, etc...).   I did find the
>> ceph.com/docs/ info [1] which discusses use of CephFS for backing Hadoop -
>> but this would be foolish for production clusters given that CephFS isn't
>> yet considered production quality/grade.
>
>
> For analytics workloads where you're handling ephemeral datasets or scratch
> data, you might find that self-supporting a cephfs instance is a workable
> solution.  The in-development fsck parts of cephfs are usually more of a
> concern for long term storage use cases, and for providing fully
> vendor-supported systems.  I'd encourage you to try out the hadoop+cephfs
> setup and let us know what kind of issues you hit, if any.

Yep! The Hadoop workload is a fairly simple one that is unlikely to
break anything in CephFS. We run a limited set of Hadoop tests against
it every week and provide bindings to set it up. The documentation is
a bit lacking here, but if you've ever used a third-party FS with
Hadoop, it shouldn't be too challenging. I'm hoping we get better
documentation written up soonish.
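
As a rough starting point, here is a sketch of what the Hadoop side of
such a setup might look like. It assumes the property names as I recall
them from the CephFS Hadoop docs (fs.defaultFS, fs.ceph.impl,
ceph.conf.file), so check them against the docs before relying on them.
The monitor address, the ceph.conf path, and the helper name
build_core_site are placeholders for illustration only.

```python
import xml.etree.ElementTree as ET


def build_core_site(mon_addr, conf_file="/etc/ceph/ceph.conf"):
    """Return a core-site.xml body pointing Hadoop at CephFS.

    Property names are assumptions taken from the CephFS Hadoop docs;
    mon_addr and conf_file are placeholders for your own cluster.
    """
    props = {
        # Default filesystem URI: ceph://<monitor host>:<port>/
        "fs.defaultFS": f"ceph://{mon_addr}/",
        # Java class provided by the CephFS Hadoop bindings
        "fs.ceph.impl": "org.apache.hadoop.fs.ceph.CephFileSystem",
        # Path to the Ceph client configuration file
        "ceph.conf.file": conf_file,
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder monitor address; substitute one of your own monitors.
    print(build_core_site("mon1.example.com:6789"))
```

The output would go into Hadoop's core-site.xml, and the bindings jar
plus the libcephfs Java/JNI libraries still need to be on Hadoop's
classpath and library path, much like any other third-party FS.
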
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


