Re: [PATCH] Expose Ceph data location information to Hadoop

>> Are we sure JNI is a real problem?  It really seems like the right tool 
>> for the job.  Greg seems to remember them asking who would maintain the 
>> (non-java) JNI bits, but even if that's us and not them (which is probably 
>> the way to go anyway), I don't see that that's a problem.
> 
> Yeah, it's sort of a wash. A nice goal would be a patch that lets Hadoop work without requiring any additional components (i.e. JNI packages) from the Ceph repository. Given that the Ceph infrastructure will be installed anyway wherever Hadoop runs on top of it, it's a bit of a toss-up.

The JNI isn't very _fun_ to develop, but it does the job just fine. It follows the expected pattern of a stable interface and needs nothing extravagant from either Hadoop or Ceph.  Hadoop already has JNI pieces, so adding more shouldn't be a problem (though I do wish the automake part weren't so awkward to approach).

I suppose there will need to be some automated check for Ceph as part of the ant build process.
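
A build-time check in ant is not shown here. As a related sketch on the runtime side, the Java code could probe for the native library before enabling the Ceph bindings, so a missing library degrades to a clear message rather than a crash. The class name and the library name "cephfs_jni_example" are placeholders I made up for illustration; they are not the real names.

```java
// Sketch only: probe for an optional JNI library at startup.
// "cephfs_jni_example" is a placeholder, not the actual Ceph library name.
final class NativeLibProbe {
    private NativeLibProbe() {}

    /** Returns true if the named native library can be loaded from java.library.path. */
    static boolean isAvailable(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // Library missing, wrong architecture, or loading not permitted.
            return false;
        }
    }

    public static void main(String[] args) {
        String lib = args.length > 0 ? args[0] : "cephfs_jni_example";
        if (isAvailable(lib)) {
            System.out.println("native library '" + lib + "' found; enabling Ceph bindings");
        } else {
            System.out.println("native library '" + lib + "' not found; skipping Ceph bindings");
        }
    }
}
```

Something along these lines could in principle be run from the build as well, but how that would be wired into ant is left open here.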

> 
> -n
> 
>> Let's start with just providing the primary replica, at least until we 
>> find out whether hadoop takes advantage of additional ones (does HDFS read 
>> from the local non-primary replica?).
> 
> I believe that Hadoop will schedule a map job on a local replica for load balancing, or to duplicate the work when a map is running slowly. Joe, can you confirm this?
> 
When I ran my basic evaluation, Hadoop reported that about 75% of jobs ran on the same node as their data.  That figure seemed to be a result of overloaded nodes.  Someone will need to run a proper evaluation, as my experiment was small and blew up when I expanded my test cluster.  The blow-up was probably a misconfigured kernel upgrade or something similarly uninteresting that's irrelevant here.
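
To make the primary-replica-only suggestion quoted above concrete, here is a rough sketch of how a byte range of a file could be turned into a list of (offset, length, host) entries with exactly one host each. It assumes the simplest possible layout, a file cut into fixed-size objects with one primary per object, and it is not taken from the patch. PrimaryLookup is a hypothetical stand-in for whatever the native layer would actually return.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: report one host (the primary replica) per object-sized range.
final class PrimaryLocationSketch {
    private PrimaryLocationSketch() {}

    /** One contiguous byte range of a file and the single host reported for it. */
    static final class Range {
        final long offset;
        final long length;
        final String host;

        Range(long offset, long length, String host) {
            this.offset = offset;
            this.length = length;
            this.host = host;
        }
    }

    /** Hypothetical lookup: which host holds the primary replica of object number idx. */
    interface PrimaryLookup {
        String primaryFor(long objectIndex);
    }

    /**
     * Splits [start, start + len) at object boundaries, clipped to the file size,
     * and tags each piece with the primary host of the object it falls in.
     * Assumes fixed-size objects laid out back to back (a simplification).
     */
    static List<Range> locate(long start, long len, long fileSize,
                              long objectSize, PrimaryLookup lookup) {
        List<Range> out = new ArrayList<>();
        if (start < 0 || len <= 0 || objectSize <= 0 || start >= fileSize) {
            return out; // nothing to report for an empty or out-of-range request
        }
        long end = Math.min(start + len, fileSize);
        long pos = start;
        while (pos < end) {
            long idx = pos / objectSize;
            long pieceEnd = Math.min((idx + 1) * objectSize, end);
            out.add(new Range(pos, pieceEnd - pos, lookup.primaryFor(idx)));
            pos = pieceEnd;
        }
        return out;
    }
}
```

If Hadoop turns out to use additional replicas, the host field would become a list, and the rest of the splitting logic would stay the same.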

--Alex--