Re: Testing failover and recovery

Hello,

Interesting; there seem to be several of us users with issues regarding recovery, but few to no replies... ;-)

I did some more testing over the weekend. Same initial workload (two glusterfs servers, one client that continuously
updates a file with timestamps) and then two simple testcases:

1. one of the glusterfs servers is constantly rebooting (just an initscript that sleeps for 60 seconds before issuing "reboot", sketched below the list)

2. similar to 1, but instead of rebooting itself, each server reboots the other, so the result is that a server
    comes up, waits for a bit and then reboots the other server.
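
For clarity, the initscript in testcase 1 is nothing more than a sketch like this (in testcase 2 the reboot command is simply issued towards the other server instead):

#!/bin/sh
# testcase 1: let the node settle for a minute after boot, then reboot it again
sleep 60
reboot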

During the whole weekend this has progressed nicely. The client runs all the time without issues, and the glusterfs server
that comes back (either the same one or alternating servers, depending on the testcase shown above) actively gets back into
sync and updates its copy of the file.

So it seems to me that we need to look deeper into the recovery case (of course, but it is interesting to know about the
nice & easy usecases as well). I'm surprised that recovery from a failover (to restore the redundancy) isn't getting
more attention here. Are we (and others that have difficulties in this area) running an unusual usecase?

BR,
Per


On Wed, Dec 4, 2013 at 12:17 PM, Per Hallsmark <per@xxxxxxxxxxxx> wrote:
Hello,

I've found GlusterFS to be an interesting project. I don't have much experience with it
(although I do from similar usecases with DRBD+NFS setups), so I set up some
testcases to try out failover and recovery.

For this I have a setup with two glusterfs servers (each is a VM) and one client (also a VM).
I'm using GlusterFS 3.4 btw.

The servers manage a gluster volume created as:

gluster volume create testvol rep 2 transport tcp gs1:/export/vda1/brick gs2:/export/vda1/brick
gluster volume start testvol
gluster volume set testvol network.ping-timeout 5
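
The volume can then be checked from either server with the usual commands:

gluster volume info testvol
gluster volume status testvol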

Then the client mounts this volume as:
mount -t glusterfs gs1:/testvol /import/testvol
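
(For completeness, the same mount as an /etc/fstab entry would be something like the line below; the backupvolfile-server option is supposed to let the client fetch the volfile from gs2 if gs1 happens to be down at mount time, though I haven't double-checked that option name on 3.4.)

gs1:/testvol  /import/testvol  glusterfs  defaults,_netdev,backupvolfile-server=gs2  0 0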

Everything seems to work well in normal usecases: I can write/read to the volume, take servers down and up again, etc.

As a fault scenario, I'm testing a fault injection like this:

1. continuously writing timestamps to a file on the volume from the client. It is automated in a small test script like:
:~/glusterfs-test$ cat scripts/test-gfs-client.sh 
#!/bin/sh

# Append a timestamp to a file on the gluster mount once per second and
# print the latest timestamp together with the file's md5sum.

gfs=/import/testvol

while true; do
	date +%s >> $gfs/timestamp.txt
	ts=`tail -1 $gfs/timestamp.txt`
	md5sum=`md5sum $gfs/timestamp.txt | cut -f1 -d" "`
	echo "Timestamp = $ts, md5sum = $md5sum"
	sleep 1
done
:~/glusterfs-test$

As can be seen, the client is a quite simple user of the glusterfs volume: low data rate and a single writer, for example.


2. disabling ethernet in one of the VMs (ifconfig eth0 down) to simulate a broken network

3. After a short while, the failed server is brought alive again (ifconfig eth0 up)

Steps 2 and 3 are also automated in a test script like:

:~/glusterfs-test$ cat scripts/fault-injection.sh 
#!/bin/sh

# fault injection script tailored for two glusterfs nodes named gs1 and gs2

# figure out which node is the peer
if [ "$HOSTNAME" = "gs1" ]; then
	peer="gs2"
else
	peer="gs1"
fi

# take the local network down for 10 seconds to simulate a broken network
inject_eth_fault() {
	echo "network down..."
	ifconfig eth0 down
	sleep 10
	ifconfig eth0 up
	echo "... and network up again."
}

# restart glusterd to get the node actively back into the cluster
recover() {
	echo "recovering from fault..."
	service glusterd restart
}

# every 60 seconds, inject a fault unless /tmp/nofault exists
# or the peer is unreachable
while true; do
	sleep 60
	if [ ! -f /tmp/nofault ]; then
		if ping -c 1 $peer; then
			inject_eth_fault
			recover
		fi
	fi
done
:~/glusterfs-test$


I then see that:

A. This goes well the first time; one server leaves the cluster and the client hangs for about 8 seconds before being able to write to the volume again.

B. When the failed server comes back, I can check from both servers that they see each other, and "gluster peer status" shows they believe the other is in connected state.

C. When the failed server comes back, it does not automatically start participating in syncing the volume etc. (the timestamp file on its local storage isn't updated).

D. If I restart the glusterd service (service glusterd restart), the failed node seems to get back to how it was before. Not always, though... The chance is higher if I have a long time between fault injections (long = 60 sec or so, with a forced faulty state of 10 sec).
With a period time of some minutes, I could have the cluster servicing the client OK for at least 8+ hours.
Shortening the period, I'm easily down to like 10-15 minutes.

E. Sooner or later I enter a state where the two servers seem to be up, each seeing its peer (gluster peer status) and such, but neither is serving the volume to the client.
I've tried to "heal" the volume in different ways but it doesn't help. Sometimes it is just that the timestamp copy on one of
the servers is ahead, which is the simpler case, but sometimes both timestamp files have data appended at the end that the other doesn't have.
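
(For reference, the healing attempts mentioned in E are basically just the standard commands, e.g.:

gluster volume heal testvol info
gluster volume heal testvol info split-brain
gluster volume heal testvol
gluster volume heal testvol full

but none of these get the volume served to the client again once I'm in that state.)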

To the questions: 

* Is it so that, from a design point of view, the glusterfs team has chosen that one shouldn't rely solely on the glusterfs daemons being able to recover from a faulty state? Is a cluster manager service (like heartbeat, for example) needed as part of the setup? That would make experience C understandable, and one could then use heartbeat or similar packages to start/stop services.

* What would then be the recommended procedure to recover from a faulty glusterfs node? (so that experiences D and E don't happen)

* What is the expected failover timing (of course depending on config, but say with a given ping timeout etc.),
  and the expected recovery timing (with similar dependency on config)?

* What/how is the glusterfs team testing to make sure that the failover and recovery/healing functionality etc. works?

Any opinion on whether the testcase itself is bad is of course also very welcome.

Best regards,
Per

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users
