MapReduce on GlusterFS in Hadoop

Hi All,



Thanks a lot for taking the time to answer my question.



I am trying to implement a file system in Hadoop under the
org.apache.hadoop.fs package, something similar to KFS, GlusterFS, etc.
While looking at the GlusterFS plugin, I noticed that its README.txt
says:



>> # ./bin/start-mapred.sh
>>
>> If the map/reduce job/task trackers are up, all I/O will be done to
>> GlusterFS.
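
For concreteness, here is roughly the skeleton I am working from. To be
clear, this is only my own sketch against the abstract FileSystem API;
the class name is a placeholder and every method is a stub, not the
actual GlusterFS plugin code:

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.util.Progressable;

// Skeleton of a Hadoop FileSystem plugin; a real plugin would
// translate each call into operations on the backing store.
public class MyFileSystem extends FileSystem {

    private URI uri;
    private Path workingDir = new Path("/");

    @Override
    public void initialize(URI name, Configuration conf) throws IOException {
        super.initialize(name, conf);
        this.uri = name; // connect to / mount the backing store here
    }

    @Override
    public URI getUri() {
        return uri;
    }

    @Override
    public FSDataInputStream open(Path f, int bufferSize) throws IOException {
        throw new UnsupportedOperationException("TODO: open " + f);
    }

    @Override
    public FSDataOutputStream create(Path f, FsPermission permission,
            boolean overwrite, int bufferSize, short replication,
            long blockSize, Progressable progress) throws IOException {
        throw new UnsupportedOperationException("TODO: create " + f);
    }

    @Override
    public FSDataOutputStream append(Path f, int bufferSize,
            Progressable progress) throws IOException {
        throw new UnsupportedOperationException("TODO: append " + f);
    }

    @Override
    public boolean rename(Path src, Path dst) throws IOException {
        throw new UnsupportedOperationException("TODO: rename");
    }

    @Override
    public boolean delete(Path f, boolean recursive) throws IOException {
        throw new UnsupportedOperationException("TODO: delete");
    }

    @Override
    @Deprecated
    public boolean delete(Path f) throws IOException {
        return delete(f, true);
    }

    @Override
    public FileStatus[] listStatus(Path f) throws IOException {
        throw new UnsupportedOperationException("TODO: listStatus");
    }

    @Override
    public void setWorkingDirectory(Path dir) {
        workingDir = dir;
    }

    @Override
    public Path getWorkingDirectory() {
        return workingDir;
    }

    @Override
    public boolean mkdirs(Path f, FsPermission permission) throws IOException {
        throw new UnsupportedOperationException("TODO: mkdirs");
    }

    @Override
    public FileStatus getFileStatus(Path f) throws IOException {
        throw new UnsupportedOperationException("TODO: getFileStatus");
    }
}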



So, suppose my input files are scattered across different nodes
(GlusterFS servers). How do I (a Hadoop client with the GlusterFS
plugin installed) submit a MapReduce job over them?

Moreover, once the job is submitted, would my Hadoop client fetch all
the data from the different servers to my local machine and run
MapReduce there, or would the TaskTracker daemons on the machine(s)
where the input file(s) are located run the map tasks locally?

Please correct me if I am wrong, but I believe the locations of the
input files for MapReduce are returned by the function
getFileBlockLocations(FileStatus file, long start, long len).
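
If that is right, then I imagine a plugin overrides it along these
lines. Again, this is only my sketch, building on the skeleton above;
lookupHosts() is a hypothetical helper standing in for however the
backing store reports which servers hold a given byte range:

import java.io.IOException;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

// Sketch only: returns a single BlockLocation covering the requested
// range. The JobTracker reads the host names out of this array and
// tries to schedule map tasks on (or close to) those machines.
public class MyLocationAwareFileSystem extends MyFileSystem {

    @Override
    public BlockLocation[] getFileBlockLocations(FileStatus file,
            long start, long len) throws IOException {
        if (file == null) {
            return null;
        }
        String[] hosts = lookupHosts(file.getPath(), start, len);
        // First argument (name:port pairs) left null; BlockLocation
        // substitutes an empty array for a null argument.
        return new BlockLocation[] {
            new BlockLocation(null, hosts, start, len)
        };
    }

    // Hypothetical helper: ask the backing store which servers hold
    // the bytes [start, start + len) of the file at path p.
    private String[] lookupHosts(Path p, long start, long len) {
        return new String[] { "localhost" }; // placeholder
    }
}

Assuming that is how it works, the hosts returned here would be what
lets the framework move the computation to the data rather than the
data to the computation.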



Thank you very much for your time and help.



Regards,

Nikhil