Hi,
Attaching the steps to re-create the issue.
As part of an entry transaction, before performing the
create/mknod/mkdir/rmdir/unlink/link/symlink/rename fops, AFR takes the
appropriate entry locks and then performs the pre-op. If the fop then
fails on all nodes, the changelog leaves the directory in the 'FOOL'
state. Because of this, the subsequent self-heal will be a conservative
merge, which may bring back files that were already deleted, leading to
duplicate entries across distribute subvolumes.
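The pending changelog can be inspected directly on the bricks by dumping
the trusted.afr xattrs on the affected directory. A minimal sketch,
assuming the layout used in the script below (volume $V0 with bricks
under $B0, dir1 as the affected directory):

# Non-zero pending entry counts here are what drive the heal decision;
# when the bricks blame themselves, the entry self-heal falls back to a
# conservative merge of the directory contents.
getfattr -d -m trusted.afr -e hex $B0/${V0}0/dir1
getfattr -d -m trusted.afr -e hex $B0/${V0}1/dir1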
We need to improve afr-transaction to handle this case gracefully.
Sending this mail to start the discussion towards a solution. Please
feel free to contribute.
Pranith.
#!/bin/bash
. $(dirname $0)/../include.rc
cleanup;
BRICK_COUNT=6
TEST glusterd
TEST pidof glusterd
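# create a 2x2 distributed-replicate volume (two replica pairs) so that
# AFR entry self-heal comes into play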
TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}0 $H0:$B0/${V0}1 $H0:$B0/${V0}2 $H0:$B0/${V0}3
TEST $CLI volume set $V0 brick-log-level DEBUG
TEST $CLI volume set $V0 client-log-level DEBUG
TEST $CLI volume set $V0 metadata-self-heal off
TEST $CLI volume set $V0 self-heal-daemon off
TEST $CLI volume start $V0
## Mount FUSE
TEST glusterfs -s $H0 --volfile-id $V0 $M0;
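# give the mount a few seconds to settle before creating files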
sleep 5;
TEST mkdir $M0/dir{1..10};
#TEST cp /bin/* $M0/;
TEST touch $M0/dir{1..10}/files{1..100};
# add a brick process
TEST $CLI volume add-brick $V0 $H0:$B0/${V0}4 $H0:$B0/${V0}5
TEST $CLI volume rebalance $V0 start force
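# kill one brick of the first replica pair while rebalance is in progress,
# then let rebalance run for a while with it down (the pid file path
# assumes the test harness default of B0=/d/backends)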
kill -9 `cat /var/lib/glusterd/vols/$V0/run/$H0-d-backends-${V0}0.pid`;
sleep 50;
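# bring the killed brick back online with a forced volume start; its stale
# entries are what the subsequent self-heal will merge back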
TEST $CLI volume start $V0 force;
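# with the self-heal daemon disabled, a recursive stat from the mount forces
# lookups, which trigger client-side self-heal on the directories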
function trigger_self_heal {
    find $M0 | xargs stat > /dev/null;
}
TEST trigger_self_heal
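# each directory should still have exactly 100 files; anything more means
# the conservative merge resurrected entries that rebalance had already
# migrated, i.e. duplicates across distribute subvolumes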
function test_num_files_100 {
    [ $(ls "$1" | wc -l) -eq 100 ]
}
TEST test_num_files_100 $M0/dir1
TEST test_num_files_100 $M0/dir2
TEST test_num_files_100 $M0/dir3
TEST test_num_files_100 $M0/dir4
TEST test_num_files_100 $M0/dir5
TEST test_num_files_100 $M0/dir6
TEST test_num_files_100 $M0/dir7
TEST test_num_files_100 $M0/dir8
TEST test_num_files_100 $M0/dir9
TEST test_num_files_100 $M0/dir10