Re: [PATCH] Expose Ceph data location information to Hadoop

> Are we sure JNI is a real problem?  It really seems like the right tool 
> for the job.  Greg seems to remember them asking who would maintain the 
> (non-java) JNI bits, but even if that's us and not them (which is probably 
> the way to go anyway), I don't see that that's a problem.

Yeah, it's sort of a wash. A nice goal would be a patch that lets Hadoop work without requiring any additional components (i.e. JNI packages) from the Ceph repository. But given that the Ceph infrastructure will be installed anyway when Hadoop is running on top of it, it's a bit of a toss-up.

-n

> Let's start with just providing the primary replica, at least until we 
> find out whether hadoop takes advantage of additional ones (does HDFS read 
> from the local non-primary replica?).

I believe that Hadoop will schedule a map task at a local replica for load balancing, or to duplicate the work when a map is running slowly. Joe, can you confirm this?
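
To make the question concrete, here is a toy sketch of why exposing only the primary replica might matter. It is not Hadoop's scheduler and does not use any Hadoop or Ceph API; the function `pick_node`, the host names, and the slot counts are all made up for illustration. It only assumes that a scheduler prefers a node that both holds the block and has a free task slot.

```python
from typing import Dict, List, Optional


def pick_node(replica_hosts: List[str],
              free_slots: Dict[str, int],
              primary_only: bool) -> Optional[str]:
    """Return a host that holds the block and has a free slot.

    replica_hosts is ordered with the primary replica first.
    Returns None when no data-local placement is possible, i.e. the
    task would have to read the block over the network.
    """
    # With primary_only, the scheduler only ever learns about the first host.
    candidates = replica_hosts[:1] if primary_only else replica_hosts
    for host in candidates:
        if free_slots.get(host, 0) > 0:
            return host
    return None


# Hypothetical cluster: the primary (osd-a) is busy, the other replicas are not.
replicas = ["osd-a", "osd-b", "osd-c"]
slots = {"osd-a": 0, "osd-b": 2, "osd-c": 1}

print(pick_node(replicas, slots, primary_only=True))   # None
print(pick_node(replicas, slots, primary_only=False))  # osd-b
```

In this model, reporting only the primary loses the local placement whenever the primary's node is busy. Whether that matters in practice depends on the answer to the question above.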
