Hi,
Attaching the steps to re-create the issue.
As part of an entry transaction, before performing the
create/mknod/mkdir/rmdir/unlink/link/symlink/rename fops, AFR takes the
appropriate entry locks and then performs the pre-op. If the fop then
fails on all nodes, the changelog leaves the directory in the 'FOOL'
state. Because of this, the subsequent self-heal will be a conservative
merge, which may bring back files that were already deleted, leading to
duplicate entries across distribute subvolumes.
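The pending changelog can be inspected directly on the bricks by dumping
the trusted.afr xattrs on the affected directory. A minimal sketch,
assuming the layout used in the script below (volume $V0 with bricks
under $B0, dir1 as the affected directory):

# Non-zero pending entry counts here are what drive the heal decision;
# when the bricks blame themselves, the entry self-heal falls back to a
# conservative merge of the directory contents.
getfattr -d -m trusted.afr -e hex $B0/${V0}0/dir1
getfattr -d -m trusted.afr -e hex $B0/${V0}1/dir1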
We need to improve afr-transaction to handle this case gracefully.
Sending this mail to start the discussion towards a solution. Please
feel free to contribute.
Pranith.
#!/bin/bash
. $(dirname $0)/../include.rc
cleanup;
BRICK_COUNT=6
TEST glusterd
TEST pidof glusterd
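# create a 2x2 distributed-replicate volume (two replica pairs) so that
# AFR entry self-heal comes into play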
TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}0 $H0:$B0/${V0}1 $H0:$B0/${V0}2 $H0:$B0/${V0}3
TEST $CLI volume set $V0 brick-log-level DEBUG
TEST $CLI volume set $V0 client-log-level DEBUG
TEST $CLI volume set $V0 metadata-self-heal off
TEST $CLI volume set $V0 self-heal-daemon off
TEST $CLI volume start $V0
## Mount FUSE
TEST glusterfs -s $H0 --volfile-id $V0 $M0;
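# give the mount a few seconds to settle before creating files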
sleep 5;
TEST mkdir $M0/dir{1..10};
#TEST cp /bin/* $M0/;
TEST touch $M0/dir{1..10}/files{1..100};
# add a brick process
TEST $CLI volume add-brick $V0 $H0:$B0/${V0}4 $H0:$B0/${V0}5
TEST $CLI volume rebalance $V0 start force
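# kill one brick of the first replica pair while rebalance is in progress,
# then let rebalance run for a while with it down (the pid file path
# assumes the test harness default of B0=/d/backends)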
kill -9 `cat /var/lib/glusterd/vols/$V0/run/$H0-d-backends-${V0}0.pid`;
sleep 50;
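# bring the killed brick back online with a forced volume start; its stale
# entries are what the subsequent self-heal will merge back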
TEST $CLI volume start $V0 force;
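# with the self-heal daemon disabled, a recursive stat from the mount forces
# lookups, which trigger client-side self-heal on the directories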
function trigger_self_heal {
    find $M0 | xargs stat > /dev/null;
}
TEST trigger_self_heal
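# each directory should still have exactly 100 files; anything more means
# the conservative merge resurrected entries that rebalance had already
# migrated, i.e. duplicates across distribute subvolumes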
function test_num_files_100 {
    [ $(ls "$1" | wc -l) -eq 100 ]
}
TEST test_num_files_100 $M0/dir1
TEST test_num_files_100 $M0/dir2
TEST test_num_files_100 $M0/dir3
TEST test_num_files_100 $M0/dir4
TEST test_num_files_100 $M0/dir5
TEST test_num_files_100 $M0/dir6
TEST test_num_files_100 $M0/dir7
TEST test_num_files_100 $M0/dir8
TEST test_num_files_100 $M0/dir9
TEST test_num_files_100 $M0/dir10