+1

On Thu, Sep 5, 2013 at 4:18 PM, Anand Avati <avati at gluster.org> wrote:

>
> On Thu, Sep 5, 2013 at 2:53 PM, Stephen Watt <swatt at redhat.com> wrote:
>
>> Hi Folks
>>
>> We are pleased to announce a major update to the glusterfs-hadoop project
>> with the release of version 2.1. The glusterfs-hadoop project provides an
>> Apache-licensed Hadoop FileSystem plugin which enables Apache Hadoop 1.x
>> and 2.x to run directly on top of GlusterFS. This release includes a
>> re-architected plugin which now extends existing functionality within
>> Hadoop to run on local and POSIX File Systems.
>>
>> -- Overview --
>>
>> Apache Hadoop has a pluggable FileSystem Architecture. This means that if
>> you have a filesystem or object store that you would like to use with
>> Hadoop, you can create a Hadoop FileSystem plugin for it which will act as
>> a mediator between the generic Hadoop FileSystem interface and your
>> filesystem of choice. For example, over a million Hadoop clusters are spun
>> up on Amazon every year, many of which use Amazon S3 as the Hadoop
>> FileSystem.
>>
>> The plugin requires a specific deployment configuration. First, the Hadoop
>> JobTracker and TaskTrackers (or the Hadoop 2.x equivalents) must be
>> installed on servers within the gluster trusted storage pool for a given
>> gluster volume. The JobTracker uses the plugin to query the extended
>> attributes for job input files in gluster to ascertain file placement as
>> well as the distribution of file replicas across the cluster. The
>> TaskTrackers use the plugin to leverage a local fuse mount of the gluster
>> volume in order to access the data required for the tasks.
>> When the JobTracker receives a Hadoop job, it
>> uses the locality information it ascertains via the plugin to send the
>> tasks for the Hadoop Job to Hadoop TaskTrackers on servers that have the
>> data required for the task within their local bricks. This ensures data is
>> read from disk and not over the network. Please see the attached diagram
>> which provides an overview of the entire solution for a Hadoop 1.x
>> deployment.
>>
>> The community project, along with the documentation and available
>> releases, is hosted within the Gluster Forge at
>> http://forge.gluster.org/hadoop. The glusterfs-hadoop project will also
>> be available within the Fedora 20 release later this year, alongside fellow
>> Fedora newcomer Apache Hadoop and the already available gluster project.
>> The glusterfs-hadoop project team welcomes contributions and participation
>> from the broader community.
>>
>> Stay tuned for upcoming posts around GlusterFS integration into the
>> Apache Ambari and Fedora projects.
>>
>> Regards
>> The glusterfs-hadoop project team
>> _______________________________________________
>> Announce mailing list
>> Announce at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/announce
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
>
> Congratulations! This is great news!!
>
> Avati
>

-- 
*Religious confuse piety with mere ritual, the virtuous confuse regulation with outcomes*