Re: Client side AFR race conditions?

Hmm... So you are saying the problem is writing without locking?
Shouldn't writing to a file involve an implicit lock, regardless of flock?

Gordan

On Tue, 6 May 2008, Martin Fick wrote:

--- Anand Babu Periasamy <ab@xxxxxxxxxx> wrote:

I really want to understand the issue and help you
out. We always have heated discussions even in our
labs. We only take it positively :) Your feedback is
very valuable to us.

No prob!  I appreciate it.


Martin Fick wrote:
In other words, what prevents conflicts when
client A & B both write to the same file?  Could
A's write to subvolume A succeed before B's write
to subvolume A, and at the same time B's write to
subvolume B succeed before A's write to subvolume
B?
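
To make the ordering in question concrete, here is a
minimal sketch. It is purely an illustration: two local
directories stand in for the two subvolumes, and the
apply_write helper is hypothetical. Nothing here touches
glusterfs or claims to show how AFR actually orders
writes.

```shell
#!/bin/bash
# Hypothetical illustration only: two local directories play the role of
# the two AFR subvolumes. No glusterfs is involved.
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"

# apply_write REPLICA VALUE: one client's write landing on one replica
apply_write()
{
 echo "$2" > "$tmp/$1/file"
}

# Replica a sees client A's write first, then client B's.
apply_write a AAAA
apply_write a BBBB

# Replica b sees the same two writes in the opposite order.
apply_write b BBBB
apply_write b AAAA

# Each replica keeps whichever write arrived last, so the copies differ.
if cmp -s "$tmp/a/file" "$tmp/b/file" ; then
 result="replicas agree"
else
 result="replicas differ: a=$(cat "$tmp/a/file") b=$(cat "$tmp/b/file")"
fi
echo "$result"
rm -rf "$tmp"
```

If both replicas applied the writes in the same order,
the two copies would end up identical no matter which
client "won".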


OK, just for giggles I created a test script to
attempt to replicate this theoretical problem.  I was
able to do so fairly easily, more easily than I might
have hoped!

What I wrote was a simple shell script which attempts
to write to a file at the same time from two different
processes.  Since this is actually hard to time with a
shell script, I did not expect very good results.  A C
program is likely to create much greater contention
possibilities.  The script simply writes once from
each process to a file, then increments the filename
counter and starts over with the next file.  Each
process should therefore perform exactly one short
write (roughly 20 characters) to each file.  I have
attached the test script (stress).  I run the script
with the following options:

 ./stress /mnt -d /mnt2 -c 100

This tells it to perform the test on 100 files and
specifies the 2 different mount points.


As for my glusterfs setup, I use two client afr mounts
on the same machine /mnt and /mnt2.  As noted above,
the script is run so that each process points to a
different client mount.  The server runs on the same
machine and maps the client subvolumes to /export/a
and /export/b, configs below.


Here are the split brain counts for a few successive
runs of 100 file writes:
4,8,6,7,6,10,18,7,0,5,4,9,10,12.  Not good!  That
means that it is actually very easy to create this
problem: anywhere from 0 to 18 of the 100 files end
up inconsistent in a given run.

I use this simple command to get those results:

 diff -q /export/a /export/b |wc -l

This command compares the two different subvolumes.  I
have looked at the files themselves, and yes, they are
different: each copy holds either all AAAs or all BBBs.
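
For anyone who wants the names of the affected files
and not just a count, a per-file comparison could look
like the sketch below.  The count_diverged function is
a hypothetical helper, not part of the attached script.
Unlike the diff pipeline, it only walks the files
present in the first directory, so a file that exists
only in the second directory is not reported.

```shell
#!/bin/bash
# count_diverged DIR_A DIR_B (hypothetical helper): print the name of each
# regular file in DIR_A whose copy in DIR_B is missing or has different
# content, then print the total.
count_diverged()
{
 typeset a="$1" b="$2" f n=0
 for f in "$a"/* ; do
   [ -f "$f" ] || continue
   if ! cmp -s "$f" "$b/$(basename "$f")" 2>/dev/null ; then
     echo "differs: $(basename "$f")"
     n=$((n + 1))
   fi
 done
 echo "total: $n"
}

# Example, using the export paths from this setup:
#   count_diverged /export/a /export/b
```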

I hope this helps you debug this race condition a
little,


-Martin



I am using Debian with these packages:

 glusterfs client and server 1.3.8-0pre
 fuse-utils and libfuse  2.7.2-glfs
 linux kernel 2.6.17-1-vserver-686


Server:

volume a
 type storage/posix
 option directory /export/a
end-volume

volume b
 type storage/posix
 option directory /export/b
end-volume

volume afr
 type cluster/afr
 subvolumes a b
end-volume

volume server
 type protocol/server
 option transport-type tcp/server
 option bind-address 192.168.1.75

 subvolumes a b

 option auth.ip.a.allow *
 option auth.ip.b.allow *
end-volume


Client:


volume ca
 type protocol/client
 option transport-type tcp/client
 option remote-host 192.168.1.75
 option remote-subvolume a
end-volume

volume cb
 type protocol/client
 option transport-type tcp/client
 option remote-host 192.168.1.75
 option remote-subvolume b
end-volume

volume afr
 type cluster/afr
 subvolumes ca cb
end-volume




stress:


#!/bin/bash

trap clean HUP INT QUIT TERM

clean()
{
 kill $A $B 2>/dev/null
 wait $A $B
 rm "$A2B" "$B2A" "$P" 2>/dev/null
 exit 1
}

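# writei: one writer process.  It waits for a file number on the "rec"
# fifo, writes its value to that file, then hands the next file number
# to the other process via the "snd" fifo and immediately writes to that
# next file itself, so both processes hit the same file at about the
# same time.
writei()
{  # dir val rec snd
 typeset  dir="$1" val="$2" rec="$3" snd="$4"
 typeset -i i=0

 while  read i < "$rec" ; do
   [ -n "$SW_P" ]  && read  < "$P"   # -p: wait for a prompt
   echo "$val" > "$dir/$i"  # we follow
   i=$(($i +1))

   [ -n "$SW_S" ]  && sleep 1        # -s: slow the loop down

   # Only compare against the count when -c was given; the original
   # one-bracket test passed an empty operand to -gt/-eq without -c.
   [ -n "$C" ] && [ "$i" -gt "$C" ] && return
   echo $i > "$snd"
   [ -n "$C" ] && [ "$i" -eq "$C" ] && return
   echo "$val" > "$dir/$i"  # we initiate
 done
}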

D="$1"
D2="$D"

COM=/tmp/$(basename $0)
A2B="$COM.a2b"
B2A="$COM.b2a"
P="$COM.prompt"
mkfifo "$A2B" "$B2A"

VA=AAAAAAAAAAAAAAAAAAAAA
VB=BBBBBBBBBBBBBBBBBBBBB

while [ $# -gt 0 ] ; do
 case "$1" in
   -p|--prompt) SW_P="-p" ; mkfifo "$P" ;;
   -s|--sleep)  SW_S="-s" ;;

   -m|--manual) shift ; M="$1" ;;
   -d) shift ; D2="$1" ;;
   -c|--count) shift ; C="$1" ;;
   -a) shift ; VA="$1" ;;
   -b) shift ; VB="$1" ;;
 esac
shift ; done


writei "$D" "$VA" "$B2A$M" "$A2B" &
A=$!

writei "$D2" "$VB" "$A2B$M" "$B2A" &
B=$!

echo 0 > "$B2A"

if [ ! -z "$SW_P" ] ; then
 while true ; do  read ; echo > "$P" ; done
fi

wait $A $B
clean






_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel




