Anand Avati wrote:
>> Here is another scenario :
>> (..)
>
> if you observe the spec files a bit carefully, you will observe that
> the servers *never* communicate with each other. there is NO
> connection between server1 and server2. the file replication feature is on
> the client side (the afr translator is loaded in the client spec file).
> the client itself writes to both server1 and server2 simultaneously.
>

My bad! Sorry! Then, that could explain this:

On the glusterfs volume:

# wget http://imgsrc.hubblesite.org/hu/db/2007/16/images/a/formats/full_jpg.jpg
100% 209,780,268 11:32:12 (130.69 KB/s) - `full_jpg.jpg' saved

On the local disk:

# wget http://imgsrc.hubblesite.org/hu/db/2007/16/images/a/formats/full_jpg.jpg
100% 209,780,268 11:59:30 (4.05 MB/s) - `full_jpg.jpg' saved

(It's a Hubble JPEG image, 29566 x 14321 pixels; have a look ;))

The volume is configured as follows (where X is '1' or '2'):

Servers (192.168.28.5 and 192.168.28.6):

# cat /etc/glusterfs/glusterfs-serverX.vol
volume brickX
  type storage/posix
  option directory /export
end-volume

volume trace
  type debug/trace
  subvolumes brickX
  option debug on
end-volume

volume serverX
  type protocol/server
  option transport-type tcp/server
  subvolumes brickX
  option auth.ip.brickX.allow 192.168.28.7
end-volume

Client (192.168.28.7):

# cat /etc/glusterfs/glusterfs-client.vol
volume client1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.28.5
  option remote-subvolume brick1
end-volume

volume client2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.28.6
  option remote-subvolume brick2
end-volume

volume afr
  type cluster/afr
  subvolumes client1 client2
  option replicate *:2
end-volume

volume trace
  type debug/trace
  subvolumes afr
  option debug on
end-volume

All machines are on the same 100 Mbit/s network, so it should not be a network issue. They are all PIV HT 3 GHz, 1 GB RAM, 250 GB SATA disk, ext3 volumes.

The question is: will the transfer speed decrease as the number of replicas increases (let's say X bricks and a *:X replicate rule)? (I come back to this with a concrete sketch further down.)

>
>
>> And last but not least : let's now say that Client1 and Client2 run the
>> same service (= access the same data). What would happen ? (Isn't that
>> what you've called "split brain" ?)
>
> two clients accessing the same data at the same time is perfectly
> safe. I do not see any problem here. or probably i did not understand
> your question correctly.
>

We're in AFR, so let me give an example to explain it clearly.

I have a volume with the text file "say_hello.txt". In it, you just have the line:

"Your administrator says hello"

Now, client1 and client2 open the file. A network failure occurs: client1 can only see server1 and client2 can only see server2 (easily possible depending on your network architecture).

client1 quickly adds a "hello!" line, saves and closes the file. client2 takes his time: he writes "Thanks dear administrator, I'm client2 and I say hello to everyone who reads this file", then saves and closes it. The network comes up again.

That was my question's scenario.

In the same way, take the same scenario but without the network failure. What will happen? (Data was modified before another client committed its modifications.) Is that case left to the underlying FS (in "my" case, ext3), or will it be taken care of by some lock mechanism?

"2 clients" is just for these two examples, but what about the same cases with n clients?
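Coming back to the replica question: to make it concrete, I imagine that adding a third brick would mean a client spec roughly like the sketch below. The third server address (192.168.28.8) and its brick3 are hypothetical, and I have simply extended the pattern of the config above to *:3, so please correct me if that is not how it is meant to be written:

volume client3
  type protocol/client
  option transport-type tcp/client
  # hypothetical third server
  option remote-host 192.168.28.8
  option remote-subvolume brick3
end-volume

volume afr
  type cluster/afr
  subvolumes client1 client2 client3
  # assumption on my part: extend the *:2 rule above to three copies
  option replicate *:3
end-volume

If the client really pushes every copy itself over its single 100 Mbit/s link, my back-of-envelope guess is that the write ceiling is roughly 12.5/X MB/s for X replicas (100 Mbit/s is about 12.5 MB/s, split between X simultaneous streams), so the more replicas, the slower each write. Am I reasoning correctly?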
One typical (disastrous) scenario, which merges the two above, is when an evil one gains access to one brick, disconnects it from the net (cable unplug, or just a service stop), modifies some data (injecting wrong data into some files, for instance) and reconnects the brick. The fsck mechanism will see that some files have been modified later than the copies stored on the other bricks (and, why not, currently accessed by clients) and will try to commit the latest version (but still the wrong one) to the other bricks. Am I wrong?

I've fully understood the "power" of clients, and that's why I'm so paranoid about them. Since only they have the cluster's "full view" (knowing where every brick is, from their config) but seem to believe they are "alone" with it (that's how I understood the system), I'm really concerned about how they manage to work without disturbing one another.

Will there be a way to know whether bricks are "synchronized" (the same data replicated everywhere), which one is not and how severely, etc.? (Maybe this will be included in the "server notification framework" translator?)

>> I have another scenario, but I think it's enough for now, don't you ?
>
> more feedback is always appreciated, please shoot.
>

Well, it concerns the clients again. Let's take back the scheme from the first post:

            Server1
           /       \
Client1 ---         --- Client2
           \       /
            Server2

Client1 is afr, client2 is unify. They share the same directories, the same files. Won't there be a problem? For the time being, I'm confident that every file created by client2 will not be seen by client1, because it will not be replicated. I'm aware that client2 will see every file Client1 has ever accessed twice. (A rough sketch of what I mean by client2's spec is in the PS below.)

And finally, here is a question I asked on IRC; I will try to develop it: "In AFR mode, let's say that there is a client on every brick. Will the AFR translator make the clients write 'locally' and then replicate, or will there be only one 'write' node which replicates to the others?"

The replication is parallel: it writes at the same time. Remember the write performance I pasted at the beginning, then. This would mean that a client writing something to the volume will see its write slow down even if one of the bricks is on the same machine as it is. Am I correct?

You asked to shoot... ;) More to come, I'm afraid.

Enkahel

Sebastien LELIEVRE
slelievre@xxxxxxxxxxxxxxxx        Services to ISP
TBS-internet                      http://www.TBS-internet.com/
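PS: about the afr/unify mix above, here is roughly what I picture client2's spec file to be, just so we are talking about the same setup. This is a sketch I have not tested, and I am not sure of the exact options cluster/unify expects in the current release, so take the scheduler line as a guess on my part:

# /etc/glusterfs/glusterfs-client2.vol -- sketch, untested
volume remote1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.28.5
  option remote-subvolume brick1
end-volume

volume remote2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.28.6
  option remote-subvolume brick2
end-volume

volume unify
  type cluster/unify
  subvolumes remote1 remote2
  # guess: round-robin scheduler; the exact scheduler option may differ in your release
  option scheduler rr
end-volume

The point being that client2 would treat brick1 and brick2 as two separate pools to spread files across, while client1 treats them as mirrors of each other.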