You can also have Hadoop talking to the Rados Gateway (SWIFT API) so
that the data is in Ceph instead of HDFS.

I wrote this tutorial that might help:
https://github.com/zioproto/hadoop-swift-tutorial

Saverio

2016-04-30 23:55 GMT+02:00 Adam Tygart <mozes@xxxxxxx>:
> Supposedly cephfs-hadoop worked and/or works on hadoop 2. I am in the
> process of getting it working with cdh5.7.0 (based on hadoop 2.6.0).
> I'm under the impression that it is/was working with 2.4.0 at some
> point in time.
>
> At this very moment, I can use all of the DFS tools built into hadoop
> to create, list, delete, rename, and concat files. What I am not able
> to do (currently) is run any jobs.
>
> https://github.com/ceph/cephfs-hadoop
>
> It can be built using current (at least infernalis with my testing)
> cephfs-java and libcephfs. The only thing you'll for sure need to do
> is patch the file referenced here:
> https://github.com/ceph/cephfs-hadoop/issues/25 When building, you'll
> want to tell maven to skip tests (-Dmaven.test.skip=true).
>
> Like I said, I am digging into this still, and I am not entirely
> convinced my issues are ceph related at the moment.
>
> --
> Adam
>
> On Sat, Apr 30, 2016 at 1:51 PM, Erik McCormick
> <emccormick@xxxxxxxxxxxxxxx> wrote:
>> I think what you are thinking of is the driver that was built to actually
>> replace hdfs with rbd. As far as I know that thing had a very short lifespan
>> on one version of hadoop. Very sad.
>>
>> As to what you proposed:
>>
>> 1) Don't use Cephfs in production pre-jewel.
>>
>> 2) running hdfs on top of ceph is a massive waste of disk and fairly
>> pointless as you make replicas of replicas.
>>
>> -Erik
>>
>> On Apr 29, 2016 9:20 PM, "Bill Sharer" <bsharer@xxxxxxxxxxxxxx> wrote:
>>>
>>> Actually this guy is already a fan of Hadoop. I was just wondering
>>> whether anyone has been playing around with it on top of cephfs lately. It
>>> seems like the last round of papers were from around cuttlefish.
>>>
>>> On 04/28/2016 06:21 AM, Oliver Dzombic wrote:
>>>>
>>>> Hi,
>>>>
>>>> bad idea :-)
>>>>
>>>> It's of course nice and important to draw developers towards a
>>>> new/promising technology/software.
>>>>
>>>> But if the technology under the individual required specifications does
>>>> not match, you will just risk showing this developer how bad this
>>>> new/promising technology is.
>>>>
>>>> So you will just reach the opposite of what you want.
>>>>
>>>> So before you are doing something, usually big, like hadoop on an
>>>> unstable software, maybe you should not use it.
>>>>
>>>> For the good of the developer, for your good and for the good of the
>>>> reputation of the new/promising technology/software you wish.
>>>>
>>>> To force a penguin to somehow live in the sahara, might be possible ( at
>>>> least for some time ), but usually not a good idea ;-)
>>>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com