Hi Nathan,
http://www.gluster.org/docs/index.php/GlusterFS_High_Availability_Storage_with_GlusterFS
If anyone has the time for a few questions:
I will do my best :)
1) In the past, my server configuration had only the local volume as in
http://www.gluster.org/docs/index.php/GlusterFS_Configuration_Example_for_Four_Bricks
Your config looks like each server has its own volume plus all the others.
This setup (I'm assuming you are talking about the high availability
storage with GlusterFS article) is specifically for high availability:
if a brick goes down, there needs to be redundancy built in to provide
a fallback option. With a simple unify solution, if a brick goes down
you will be missing files (hence, not highly available at all unless
you can guarantee 100% uptime of all bricks under all circumstances,
which as far as I know no one offers :)
In that specific example, every brick is given a copy of the files, so
two bricks could fail and the cluster would still be up and running :).
Extending this to 10 bricks, you could potentially have 9 bricks fail
and it would still be working.
Point being, GlusterFS is highly configurable; you can set it up any way
you like: more redundancy (afr), more space (unify), or both. Your choice.
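Just as a rough sketch (the volume names and IP addresses below are made
up, not taken from your setup), a client spec that mirrors two remote
bricks with afr looks something like this:

# hypothetical client spec: two remote bricks mirrored with afr
volume client1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.60      # first server
  option remote-port 6996
  option remote-subvolume brick
end-volume

volume client2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.61      # second server
  option remote-port 6996
  option remote-subvolume brick
end-volume

volume afr
  type cluster/afr
  subvolumes client1 client2           # every file is written to both bricks
end-volume

If you want space as well as redundancy, you then unify several such afr
volumes together, which is roughly what the high availability article does.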
2) Why do you use loopback IP vs the public IP in the first example?
The first example introduces itself with this paragraph....
--------8<---------------------------------
In this article, I will show you how to configure GlusterFS to simulate
a cluster with:
* one single mount point
* 4 storage nodes serving the files
I said simulate since we will deploy such a cluster on only one
computer, but you can as well set it up over physical different
computers by changing the IP addresses.
------------------------->8----------------
It's only a simulation, run on one computer. If you wish to use the
example over an actual cluster of computers, you will need to change the
IP addresses accordingly.
3) Can I change option replicate *:3 to *:2 to just have two copies? I
like having an extra copy, but in my application having 3 copies for 3
nodes or 4 for 4 nodes is a bit overkill.
Yes. As I said, it's highly configurable. You can even choose to replicate
only certain files. E.g.
option replicate *.tmp:1,*.html:4,*.php:4,*.db:1,*:2
In this case any .tmp or .db files will not get replicated (they will
sit on the first afr brick only), all html and php files will be
mirrored across the first four bricks of afr, and all other files will
be mirrored across the first two bricks. (Note it is all on one line,
comma delimited, with no spaces before or after the commas.)
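To show where that line lives (the brick names here are placeholders),
it goes inside the afr volume definition in your client spec:

volume afr
  type cluster/afr
  # the order of subvolumes determines which count as the "first" bricks
  subvolumes client1 client2 client3 client4
  option replicate *.tmp:1,*.html:4,*.php:4,*.db:1,*:2
end-volume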
4) In my application the server is also the client. In my current config,
when one out of 3 servers is down I can no longer write, even though I am
telling my config to write to the local brick. Is there any way to fix
this? If a node is down I understand how I would not have access to the
files, but I don't want to take down the full cluster.
I'm not sure, sorry. Maybe one of the devs can clarify that. I know
there was an issue with an earlier release to do with AFR: when the
first brick was down, afr was no longer available.
P.S. Our setup consists of 3 servers each with a 4 port video card. They
continually capture video from the 4 inputs and save them in /share. Each
box also acts as a video server accepting connections and reading any file
in /share. I don't care much about redundancy, but I do care that /share
has files from all servers, or all servers that are up anyway.
My server.vol file; the only change between boxes is the bind IP address.
volume brick
  type storage/posix
  option directory /md0
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6996
  option bind-address 192.168.0.60
  subvolumes brick
  option auth.ip.brick.allow 192.168.0.*
end-volume
My client.vol file; the only change is that option nufa.local-volume-name
client0 becomes client1 or client2 on each box. The above works as long
as all 3 servers are up; if one is down I can no longer read or write to
/share. I found that I could continue to write to /md0 on each server
and it was still replicated correctly, however if any server is down,
reads on /share break. :(
Could you post your client spec file? That would make it clearer to me.
I have a feeling you are using separate client files for each of the
clients; perhaps it would be easier for you to use a single spec file
for your entire cluster, but we could clarify that if you post your
client file :)
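In the meantime, purely as a guess at the general shape (the volume
names and IP addresses below are placeholders, not taken from your
boxes), a unify client spec using the nufa scheduler usually looks
something like this:

# hypothetical client.vol: unify over three remote bricks, nufa scheduler
volume client0
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.60
  option remote-port 6996
  option remote-subvolume brick
end-volume

volume client1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.61
  option remote-port 6996
  option remote-subvolume brick
end-volume

volume client2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.62
  option remote-port 6996
  option remote-subvolume brick
end-volume

volume unify
  type cluster/unify
  subvolumes client0 client1 client2
  option scheduler nufa
  option nufa.local-volume-name client0   # the one line that differs per box
end-volume

Depending on which release you are running, unify may also want a
namespace volume (option namespace), but the nufa.local-volume-name line
is the only per-box difference you mentioned.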
Any help would rock!
I hope that rocked at least a little :p
Matt.