I was just wondering if the self-heal bug is planned to be fixed, or if the developers are just ignoring it in the hope it will go away. Every time I ask someone privately whether they can reproduce the problem on their own end, they go silent (which leads me to believe that they in fact can reproduce it).
The setup is very simple: AFR (cluster/replicate) with as many subvolumes as you want. The first listed subvolume always breaks the self-heal; node2 and node3 always heal fine. Swap the IP address of the first listed subvolume and you swap which box breaks the self-heal (sketched below). I have been able to reproduce this bug every day with the newest git for the last month.
Please let us know if this is not considered a bug, or acknowledge it in some fashion. Thank you.
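(To be concrete about the swap: this is just the client config from further down with the two remote-host lines exchanged. With that arrangement, the machine behind node2.ip, which is now the first listed subvolume via the "node1" volume, is the one whose self-heal breaks, and the machine behind node1.ip heals fine:)

volume node1
  type protocol/client
  option transport-type tcp
  option remote-host node2.ip   # swapped: was node1.ip
  option remote-subvolume brick
end-volume
volume node2
  type protocol/client
  option transport-type tcp
  option remote-host node1.ip   # swapped: was node2.ip
  option remote-subvolume brick
end-volume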
All nodes use the same configs (both vol files are included at the end of this mail).
all nodes: killall glusterfsd; killall glusterfs;
all nodes: rm -rf /tank/*
all nodes: glusterfsd -f /usr/local/etc/glusterfs/glusterfsd.vol
all nodes: mount -t glusterfs /usr/local/etc/glusterfs/glusterfs.vol /gtank
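(After the mount I also do a quick sanity check that the server daemon is up and the client mount actually went through; this is plain shell, nothing GlusterFS-specific, and the paths just match my setup:)

all nodes: ps -C glusterfsd -o pid,args   # is the server daemon running?
all nodes: mount | grep /gtank            # did the client mount register?
all nodes: df -h /gtank                   # is the mount answering?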
node3:~# cp -R gluster /gtank/gluster1
* Simulating a hardware failure
node1:~# killall glusterfsd ; killall glusterfs;
node1:~# killall glusterfsd ; killall glusterfs;
glusterfsd: no process killed
glusterfs: no process killed
node1:~# rm -rf /tank/*
* Data never stops changing just because we have a failed node
node3:~# cp -R gluster /gtank/gluster2
all nodes but node1:~# ls -lR /gtank/ | wc -l
2780
all nodes but node1:~# ls -lR /gtank/gluster1 | wc -l
1387
all nodes but node1:~# ls -lR /gtank/gluster2 | wc -l
1387
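(If it helps anyone reproducing this, here is a small loop to collect the same counts from the surviving nodes in one shot; node2/node3 are just my hostnames, substitute your own:)

for n in node2 node3; do
  echo -n "$n: "
  ssh "$n" 'ls -lR /gtank | wc -l'   # same count as above, gathered remotely
done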
* Adding hardware back into the network after replacing the bad hard drive(s)
node1:~# glusterfsd -f /usr/local/etc/glusterfs/glusterfsd.vol
node1:~# mount -t glusterfs /usr/local/etc/glusterfs/glusterfs.vol /gtank
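(One note on that step: as far as I know the storage/posix translator only exports a directory that already exists, so if /tank itself was lost with the failed disk it has to be recreated before glusterfsd will start:)

node1:~# mkdir -p /tank   # export directory from glusterfsd.vol; posix won't create it for you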
node3:~# ls -lR /gtank/ | wc -l
1664
node3:~# ls -lR /gtank/gluster1 | wc -l
271
node3:~# ls -lR /gtank/gluster2 | wc -l
1387
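(For anyone who wants to poke at this: the counts above come from a plain ls -lR on the mount. My understanding is that a full-tree read from a client is what is supposed to trigger entry/data self-heal, so this is the walk I would suggest trying from node3; output is discarded just to keep it quiet:)

node3:~# ls -lR /gtank > /dev/null
node3:~# find /gtank -type f -print0 | xargs -0 head -c1 > /dev/null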
glusterfsd.vol (server):

### Export volume "brick" with the contents of the "/tank" directory.
volume posix
  type storage/posix               # POSIX FS translator
  option directory /tank           # Export this directory
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  subvolumes locks
end-volume

### Add network serving capability to the above brick.
volume server
  type protocol/server
  option transport-type tcp
  subvolumes brick
  option auth.addr.brick.allow *   # Allow access to "brick" volume
  option client-volume-filename /usr/local/etc/glusterfs/glusterfs.vol
end-volume
glusterfs.vol (client):

#
# mirror block0
#
volume node1
  type protocol/client
  option transport-type tcp
  option remote-host node1.ip        # IP address of the remote brick
  # option transport-timeout 30      # seconds to wait for a reply from the server for each request
  option remote-subvolume brick      # name of the remote volume
end-volume

volume node2
  type protocol/client
  option transport-type tcp
  option remote-host node2.ip        # IP address of the remote brick
  # option transport-timeout 30      # seconds to wait for a reply from the server for each request
  option remote-subvolume brick      # name of the remote volume
end-volume

volume node3
  type protocol/client
  option transport-type tcp
  option remote-host node3.ip        # IP address of the remote brick
  # option transport-timeout 30      # seconds to wait for a reply from the server for each request
  option remote-subvolume brick      # name of the remote volume
end-volume

volume mirrorblock0
  type cluster/replicate
  subvolumes node1 node2 node3
  option metadata-self-heal yes
end-volume
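(In case it matters for the bug: my config only sets metadata-self-heal explicitly; data-self-heal and entry-self-heal are left at their defaults. As far as I know replicate accepts all three, so a fully spelled-out version would look like this; I believe these are the defaults anyway and am not claiming it changes the behaviour:)

volume mirrorblock0
  type cluster/replicate
  subvolumes node1 node2 node3
  option metadata-self-heal yes
  option data-self-heal yes    # believed to be the default
  option entry-self-heal yes   # believed to be the default
end-volume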
Gordan Bobic wrote:
First-access failing bug still seems to be present.
But other than that, it seems to be distinctly better than rc4. :)
Good work! :)
Gordan