Jerker Nyberg wrote:
Hi,
I'm trying out different configurations of GlusterFS. I have 7 nodes,
each with two 320 GB disks, where 300 GB on each disk is for the
distributed file system.
Each node is called N. On the server side, every file system is mirrored
to the other disk on the next node, wrapped around so that the last node
mirrors its disk to the first. The notation below is invented; the real
config is included at the end of this mail.
Pseudodefinitions:
fs(1) = a file system on the first disk
fs(2) = a file system on the second disk
n(I, fs(J)) = the fs J on node I
afr(N .. M) = mirror the volumes
stripe(N .. M) = stripe the volumes
Server:
Forw(N) = afr(n(N, fs(1)), n(N+1, fs(2)))
Back(N) = afr(n(N, fs(2)), n(N-1, fs(1)))
Client:
FStr(N .. M) = stripe(n(N, Forw(N)) .. n(N+1, Forw(N+1)) .. n(M, Forw(M)))
BStr(N .. M) = stripe(n(N, Back(N)) .. n(N+1, Back(N+1)) .. n(M, Back(M)))
mount /glusterfs = union(FStr(1 .. 7), BStr(1 .. 7))
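As a rough sketch in GlusterFS spec-file terms (hostnames, directories and
volume names below are just placeholders, and exact option spellings such as
tcp/client vs tcp or auth.ip vs auth.addr depend on the GlusterFS version),
the forward half of that layout would look something like this; the Back
half is built the same way in the other direction:

# server spec on node1: it builds and exports Forw(1) = afr(fs(1) on node1,
# fs(2) on node2), and also exports its raw fs2 for the previous node's mirror
volume fs1
  type storage/posix
  option directory /export/disk1
end-volume

volume fs2
  type storage/posix
  option directory /export/disk2
end-volume

# fs(2) on the next node, pulled in over the network
volume node2fs2
  type protocol/client
  option transport-type tcp/client
  option remote-host node2
  option remote-subvolume fs2
end-volume

volume forw
  type cluster/afr
  subvolumes fs1 node2fs2
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.forw.allow *
  option auth.ip.fs2.allow *
  subvolumes forw fs2
end-volume

# client spec: import each exported Forw(N) and stripe over them
volume forw1
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume forw
end-volume

# forw2 .. forw7 are defined the same way against node2 .. node7

volume fstr
  type cluster/stripe
  subvolumes forw1 forw2 forw3 forw4 forw5 forw6 forw7
end-volume

# bstr is built the same way from the Back(N) exports; the union of the two
# stripes is cluster/unify, which needs a separate namespace volume (here a
# placeholder export called nsexport)
volume ns
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume nsexport
end-volume

volume union
  type cluster/unify
  option namespace ns
  option scheduler rr
  subvolumes fstr bstr
end-volume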
The goal was to get good performance but also redundancy. But this setup
will not achieve that, will it? The stripes will not work when a part of
them is gone, and the union will not magically find the other part of a
file on the other stripe? And where should the union namespace be placed
for good performance?
But my major question is this: I tried a single stripe (not using union
on the client, just striping over the servers, which in turn mirrored).
When rsync'ing data onto it from a single server things worked fine, but
when I put some load on it from the other nodes (dd'ing some large files
in and out) the glusterfsd's on the first server died... Do you want me
to look into this further and try to reproduce and narrow down the
problem, or is this kind of setup in general not a good idea?
I don't think the developers would ever consider a situation where
glusterfsd dies acceptable, so I'm sure they would want info on why it dies.
I can't seem to correlate your spec files with the layout you gave
above. My understanding of your spec files makes it look like this to me:
Server:
Forw(N) = afr(n(N, fs(1)), n(N+1, fs(2)))
Client:
FStr(1 .. N) = stripe(n(1, Forw(1)) .. n(N, Forw(N)))
In either case you have a problem if you are looking for high
availability, as the loss of any single node will reduce the cluster to
an unusable state (unless striping does some stuff I don't know about).
Since each AFR is defined on the main node itself, the client doesn't
know it's actually putting files on nodes N and N+1; it's only talking to
N. If N dies, the client has no way of knowing it can still read from
and write to the other AFR member, N+1. Moving the AFR config to the
client would fix this.
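Something along these lines on the client would do it (again, hostnames and
volume names are placeholders and option spellings depend on your GlusterFS
version); the servers then only need to export the raw fs1 and fs2 posix
volumes:

# import both members of one mirror pair directly on the client
volume node1fs1
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume fs1
end-volume

volume node2fs2
  type protocol/client
  option transport-type tcp/client
  option remote-host node2
  option remote-subvolume fs2
end-volume

# the client now sees both AFR members, so losing node1 still leaves node2
volume afr1
  type cluster/afr
  subvolumes node1fs1 node2fs2
end-volume

# afr2 .. afr7 are built the same way for the other pairs, then striped
volume fstr
  type cluster/stripe
  subvolumes afr1 afr2 afr3 afr4 afr5 afr6 afr7
end-volume

With the wrap-around pairing you described, losing any single node then
leaves each affected AFR pair with one surviving member, so the stripe
should stay complete.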
Maybe that's why you had the Back(N) stuff, but I'm not sure I
understand what you were trying to say with that... Were you trying to
define the same AFR share through two different servers with two
different (but equivalent) configs?
--
-Kevan Benson
-A-1 Networks