Hi,

Did you test GlusterFS write performance (using 'dd') *only* from the client mount? I ask because the GlusterFS Hadoop plugin does a FUSE mount on *every* node in the cluster. So during the map phase, when jobs get assigned to slaves, all I/O (which is mostly reads) is done via FUSE. Similarly, during the reduce phase, the reduce jobs write to the FUSE mount on their respective nodes.

Can you try running the 'dd' test on all nodes in the cluster in parallel (on the FUSE mount) on the 2x2 Distribute-Replicate setup and let us know the numbers? Throughput numbers from all nodes would be helpful, if possible.

Write performance in HDFS is exceptionally good because of its aggressive client-side caching (HDFS relaxes a POSIX requirement to get higher write throughput).

Thanks,
-Venky

________________________________
From: (yongjoon kong)/Cloud Computing/SKCC [andrew.kong at sk.com]
Sent: Wednesday, October 19, 2011 11:04 PM
To: Venky Shankar; andrew; gluster-users at gluster.org
Subject: RE: gluster map/reduce performance..

Yes, I used the GlusterFS plugin. The Gluster version is 3.3 beta 2.

For the volumes:
Distributed-mirroring volume: 4 servers in a 2 (brick) x 2 (replica) configuration
Stripe-mirroring volume: 4 servers in a 4 (stripe count) x 2 (replica) configuration

For the Map/Reduce system I use 6 servers (4 are the brick servers and the other 2 are used only for Map/Reduce).

I checked your source file, but I can't find any clue to the performance degradation in the merging stage. (I think it is connected with writing.)

Actually, in the write test, Gluster was quite good. So I'm a little confused right now.

Regards,
Andrew

From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Venky Shankar
Sent: Thursday, October 20, 2011 1:35 AM
To: andrew; gluster-users at gluster.org
Subject: Re: gluster map/reduce performance..
Hi there,

We would appreciate it if you could share the following info with us:

* Are you using the GlusterFS Hadoop plugin (which is here http://download.gluster.com/pub/gluster/glusterfs/qa-releases/3.3-beta-2/glusterfs-hadoop-0.20.2-0.1.x86_64.rpm and is still in beta), or are you using GlusterFS as an additional layer below Hadoop's FileSystem (HDFS)? The latter basically means configuring Hadoop to use a GlusterFS mount point (e.g. a FUSE mount) as the data directory for Hadoop's DFS.

Let us know your setup (including the GlusterFS version) so we can debug further.

Thanks,
-Venky

________________________________
From: gluster-users-bounces at gluster.org [gluster-users-bounces at gluster.org] on behalf of andrew [sstrato.kong at gmail.com]
Sent: Wednesday, October 19, 2011 6:15 PM
To: gluster-users at gluster.org
Subject: gluster map/reduce performance..

Hi all,

I am trying to check the Map/Reduce performance of the Gluster file system.

The mapper side is quite fast, and sometimes faster than Hadoop's map job. But the reduce side is much slower than Hadoop.

I analyzed the results and found that the primary reason for the slow speed is poor performance in the merging stage.

Would you have any suggestions for this issue?

FYI, check the blog: http://storage4com.blogspot.com/

Thanks.
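
For anyone repeating the per-node write test requested at the top of this thread, here is a minimal sketch of a 'dd'-style sequential write timer. It is an illustration only, not part of GlusterFS or the Hadoop plugin: the function name timed_write and the example path /mnt/glusterfs are hypothetical, and the choice to fsync before stopping the clock is an assumption made so that cached writes are not counted as completed.

```python
import os
import time


def timed_write(path, block_size=1024 * 1024, count=64):
    """Write `count` zero-filled blocks of `block_size` bytes to `path`.

    Returns (bytes_written, throughput_in_MiB_per_second).
    """
    block = bytes(block_size)  # zero-filled buffer, similar to reading /dev/zero
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        # Assumption: include the time needed to push data out of the page
        # cache, so the figure reflects the mount and not local memory.
        os.fsync(f.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    total = block_size * count
    return total, total / (1024 * 1024) / elapsed


# Example (hypothetical mount point; use a distinct file name on each node):
# total, mib_per_s = timed_write("/mnt/glusterfs/ddtest-node1.bin")
# print(f"{total} bytes written at {mib_per_s:.1f} MiB/s")
```

To match the test described above, the script would need to be started on all nodes at roughly the same time, each writing its own file on the FUSE mount, and the per-node throughput figures reported together.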