Jerker Nyberg wrote:
Hi,
I'm trying out different configurations of GlusterFS. I have 7 nodes,
each with two 320 GB disks, where 300 GB on each disk is for the
distributed file system.
Each node is called N. On the server side, every file system is mirrored
to the other disk on the next node, wrapped around so that the last node
mirrors its disk to the first. The notation below is invented; the real
config is included at the end of this mail.
Pseudodefinitions:
fs(1) = a file system on the first disk
fs(2) = a file system on the second disk
n(I, fs(J)) = the fs J on node I
afr(N .. M) = mirror the volumes
stripe(N .. M) = stripe the volumes
Server:
Forw(N) = afr(n(N, fs(1)), n(N+1, fs(2)))
Back(N) = afr(n(N, fs(2)), n(N-1, fs(1)))
Client:
FStr(N .. M) = stripe(n(N, Forw(N)) .. n(N+1, Forw(N+1)) .. n(M, Forw(M)))
BStr(N .. M) = stripe(n(N, Back(N)) .. n(N+1, Back(N+1)) .. n(M, Back(M)))
mount /glusterfs = union(FStr(1 .. 7), BStr(1 .. 7))
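As a rough sketch in GlusterFS spec-file terms (hostnames, directories and
volume names below are just placeholders, and exact option spellings such as
tcp/client vs tcp or auth.ip vs auth.addr depend on the GlusterFS version),
the forward half of that layout would look something like this; the Back
half is built the same way in the other direction:

# server spec on node1: it builds and exports Forw(1) = afr(fs(1) on node1,
# fs(2) on node2), and also exports its raw fs2 for the previous node's mirror
volume fs1
  type storage/posix
  option directory /export/disk1
end-volume

volume fs2
  type storage/posix
  option directory /export/disk2
end-volume

# fs(2) on the next node, pulled in over the network
volume node2fs2
  type protocol/client
  option transport-type tcp/client
  option remote-host node2
  option remote-subvolume fs2
end-volume

volume forw
  type cluster/afr
  subvolumes fs1 node2fs2
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.forw.allow *
  option auth.ip.fs2.allow *
  subvolumes forw fs2
end-volume

# client spec: import each exported Forw(N) and stripe over them
volume forw1
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume forw
end-volume

# forw2 .. forw7 are defined the same way against node2 .. node7

volume fstr
  type cluster/stripe
  subvolumes forw1 forw2 forw3 forw4 forw5 forw6 forw7
end-volume

# bstr is built the same way from the Back(N) exports; the union of the two
# stripes is cluster/unify, which needs a separate namespace volume (here a
# placeholder export called nsexport)
volume ns
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume nsexport
end-volume

volume union
  type cluster/unify
  option namespace ns
  option scheduler rr
  subvolumes fstr bstr
end-volume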
The goal was to get good performance but also redundancy. But this setup
will not achieve that, will it? The stripes will not work when a part of
them is gone, and the union will not magically find the other part of a
file on the other stripe? And where should the union namespace be placed
for good performance?
But my major question is this: I tried a single stripe (not using union
on the client, just striping over the servers, which in turn mirrored).
When rsync'ing data onto it from a single server things worked fine, but
when I put some load on it from the other nodes (dd'ing some large files
in and out) the glusterfsd's on the first server died... Do you want me
to look into this further and try to reproduce and narrow down the
problem, or is this kind of setup in general not a good idea?
I don't think the developers would ever consider a situation where
glusterfsd dies acceptable, so I'm sure they would want info on why it dies.
I can't seem to correlate your spec files with the layout you gave
above. My understanding of your spec files makes it look like this to me:
Server:
Forw(N) = afr(n(N, fs(1)), n(N+1, fs(2)))
Client:
FStr(1 .. N) = stripe(n(1, Forw(1)) .. n(N, Forw(N)))
In either case you have a problem if you are looking for high
availability, as the loss of any single node will reduce the cluster to
an unusable state (unless striping does some stuff I don't know about).
Since each AFR is defined on the main node itself, the client doesn't
know it's actually putting files on nodes N and N+1; it's only talking to
N. If N dies, the client has no way of knowing it can still read from
and write to the other AFR member, N+1. Moving the AFR config to the
client would fix this.
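Something along these lines on the client would do it (again, hostnames and
volume names are placeholders and option spellings depend on your GlusterFS
version); the servers then only need to export the raw fs1 and fs2 posix
volumes:

# import both members of one mirror pair directly on the client
volume node1fs1
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume fs1
end-volume

volume node2fs2
  type protocol/client
  option transport-type tcp/client
  option remote-host node2
  option remote-subvolume fs2
end-volume

# the client now sees both AFR members, so losing node1 still leaves node2
volume afr1
  type cluster/afr
  subvolumes node1fs1 node2fs2
end-volume

# afr2 .. afr7 are built the same way for the other pairs, then striped
volume fstr
  type cluster/stripe
  subvolumes afr1 afr2 afr3 afr4 afr5 afr6 afr7
end-volume

With the wrap-around pairing you described, losing any single node then
leaves each affected AFR pair with one surviving member, so the stripe
should stay complete.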
Maybe that's why you had the Back(N) stuff, but I'm not sure I
understand what you were trying to say with that... Were you trying to
define the same AFR share through two different servers with two
different (but equivalent) configs?
--
-Kevan Benson
-A-1 Networks