Hello,

To make things clear, here is what I did:
- deployed GlusterFS on 2, 4, 8, 16, 32, 64 and 128 nodes
- ran a variant of the MAB benchmark (essentially the compilation of openssl-1.0.0) on 2, 4, 8, 16, 32, 64 and 128 nodes
- used 'pdsh -f 512' to start MAB on all nodes at the same time
- in each experiment, each node ran MAB in its own dedicated directory within the GlusterFS global namespace (e.g. nodeA used <gluster global namespace>/nodeA/<mab files>) to avoid a metadata storm on the parent directory inode
- between experiments, I destroyed and redeployed a completely new GlusterFS setup (and I also wiped everything within each brick, i.e. the exported storage directory)

I then compare the average compilation time against the number of nodes ... and it increases, due to the round-robin scheduler that dispatches files over all the bricks:

   2 nodes : Phase_V(s) avg  249.9332121175
   4 nodes : Phase_V(s) avg  262.808117374
   8 nodes : Phase_V(s) avg  293.572061537875
  16 nodes : Phase_V(s) avg  351.436554833375
  32 nodes : Phase_V(s) avg  546.503069517844
  64 nodes : Phase_V(s) avg 1010.61019479478

(Phase V is the compilation itself; the previous phases are about metadata ops.)

You can also try to compile a Linux kernel yourself, it is pretty much the same workload.

Now regarding the GlusterFS setup: yes, you're right, there is no replication, so this is a simple striping setup (on a per-file basis), i.e. pure distribute.

Each time, I create a GlusterFS volume with a single brick, then I add bricks one by one until I reach the number of nodes, and only then do I start the volume (a rough command sketch is at the end of this mail).

Regarding the 128-brick case: it is when I start the volume that I get a random error telling me that <brickX> does not respond, and the failing brick changes every time I retry to start the volume. So far I have not tested with a number of nodes between 64 and 128.

François

On Friday, June 10, 2011 16:38 CEST, Pavan T C <tcp at gluster.com> wrote:

> On Wednesday 08 June 2011 06:10 PM, Francois THIEBOLT wrote:
> > Hello,
> >
> > I'm driving some experiments on grid'5000 with GlusterFS 3.2 and, as a
> > first point, i've been unable to start a volume featuring 128bricks (64 ok)
> >
> > Then, due to the round-robin scheduler, as the number of nodes increase
> > (every node is also a brick), the performance of an application on an
> > individual node decrease!
>
> I would like to understand what you mean by "increase of nodes". You
> have 64 bricks and each brick also acts as a client. So, where is the
> increase in the number of nodes? Are you referring to the mounts that
> you are doing?
>
> What is your gluster configuration - I mean, is it a distribute only, or
> is it a distributed-replicate setup? [From your command sequence, it
> should be a pure distribute, but I just want to be sure].
>
> What is your application like? Is it mostly I/O intensive? It will help
> if you provide a brief description of typical operations done by your
> application.
>
> How are you measuring the performance? What parameter determines that
> you are experiencing a decrease in performance with increase in the
> number of nodes?
>
> Pavan
>
> > So my question is : how to STOP the round-robin distribution of files
> > over the bricks within a volume ?
> >
> > *** Setup ***
> > - i'm using glusterfs3.2 from source
> > - every node is both a client node and a brick (storage)
> > Commands :
> > - gluster peer probe <each of the 128nodes>
> > - gluster volume create myVolume transport tcp <128 bricks:/storage>
> > - gluster volume start myVolume (fails with 128 bricks!)
> > - mount -t glusterfs ...... on all nodes
> >
> > Feel free to tell me how to improve things
> >
> > François
> >
>
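
P.S. In case the exact sequence helps, here is roughly what one experiment looks like on my side (node names, the -w node list, mount point and paths below are just placeholders, not the real grid'5000 ones):

  # build the volume: a single brick at creation time, then add the remaining bricks one by one
  gluster peer probe node002                            # repeated for node003 ... node064
  gluster volume create myVolume transport tcp node001:/storage
  gluster volume add-brick myVolume node002:/storage    # repeated up to node064:/storage
  gluster volume start myVolume

  # mount the volume everywhere and launch MAB on all nodes at once, each in its own directory
  pdsh -f 512 -w node[001-064] 'mount -t glusterfs node001:/myVolume /mnt/gluster'
  pdsh -f 512 -w node[001-064] 'mkdir -p /mnt/gluster/$(hostname); cd /mnt/gluster/$(hostname); <run MAB / openssl-1.0.0 build>'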