Hello everyone,

I am very new to GlusterFS, but I finally got a testing environment set up based on your rich documentation, following the most interesting examples. I am very excited about GlusterFS because it seems to be exactly what I am looking for. For now I have tried out how the afr, unify and self-healing mechanisms work, and I am getting some unexpected results - at least from my point of view.

Prologue: I am running GlusterFS from the Debian etch packages available on the site, i.e. GlusterFS 1.3.8-0pre2 with Fuse 2.7.2-glfs8-0 (great packages by the way, they work perfectly).

I basically set up two servers that replicate each other and added the unify translator on top for self healing. There is also a client running on each server, mounting the filesystem exported by the local server (which, because of the afr, is the same on both). The structure on each server is based on the GlusterFS 1.3 P2P Cluster with Auto Healing example <http://www.gluster.org/docs/index.php/GlusterFS_1.3_P2P_Cluster_with_Auto_Healing>, only somewhat simpler, without all the RAID stuff:

  www1-ds    = local ds
  www1-ns    = local ns
  www2-ds    = remote ds
  www2-ns    = remote ns
  www-ds-afr = afr: www1-ds and www2-ds
  www-ns-afr = afr: www1-ns and www2-ns
  www        = unify: www-ds-afr (as storage) and www-ns-afr (as namespace)

The same, with the roles swapped, on the second server node. This setup works fine: all data is replicated to both servers, clients can mount it, and no log messages appear, so I assume everything is ok. I also uploaded my configuration files to http://daniel.users.hostunity.net/glusterfs/, maybe this helps; a condensed sketch of the volume layering and of my test sequence is included further below.

However, when I tested the self-healing mechanism, something seems to go wrong:

First, while both servers are up, I create a file named "before". This file is replicated correctly. Then I kill one of the two servers (killall -9 glusterfsd) to simulate a crash. Afterwards I delete the "before" file on a client connected to the still working server and create a new one named "after". This file is correctly saved in the exported directory and the namespace of the still working server.

Then I start the other server node again and watch what happens, which is somewhat unexpected: when the second server is accessed again by some client, its glusterfsd process dies and deletes all contents of the local export directory (which is basically the "before" file) as well as the complete local namespace. All data on the node that stayed up and did not crash is still intact, so no damage is done there. When I start the crash-simulated server once more, it does not crash anymore, but no files remain: the exported directory and the namespace are empty, and an "ls -al" or similar does not show any files.

Running "find > /dev/null" or "ls -lR" on both servers did not bring back any files either. However, when I access the perceived non-existing "after" file through a client connected to the previously crash-simulated server, it is "healed" and shows up in "ls" again. Still, I did not find a way to trigger self healing for files that are not explicitly accessed, which will result in chaos when the other server crashes the next time, because those files will be lost. Of course I could stop the whole cluster (causing a global downtime) every time a crashed server returns and synchronize it myself using rsync or similar, but is that intended? The same also happens when I create the files inside a directory other than the root of the mount.
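For convenience, here is a condensed sketch of the volume layering on www1; directory paths and hostnames are placeholders, and the protocol/server volume exporting the local bricks (and the client spec mounting "www") is omitted - the complete files are at the URL above:

  # bricks served by the local glusterfsd
  volume www1-ds
    type storage/posix
    option directory /data/export-ds        # placeholder path
  end-volume

  volume www1-ns
    type storage/posix
    option directory /data/export-ns        # placeholder path
  end-volume

  # the corresponding bricks on the other node, reached over the network
  volume www2-ds
    type protocol/client
    option transport-type tcp/client
    option remote-host www2                  # placeholder hostname
    option remote-subvolume www2-ds
  end-volume

  volume www2-ns
    type protocol/client
    option transport-type tcp/client
    option remote-host www2
    option remote-subvolume www2-ns
  end-volume

  # replicate data and namespace across both nodes
  volume www-ds-afr
    type cluster/afr
    subvolumes www1-ds www2-ds
  end-volume

  volume www-ns-afr
    type cluster/afr
    subvolumes www1-ns www2-ns
  end-volume

  # unify on top, using the replicated namespace
  volume www
    type cluster/unify
    option namespace www-ns-afr
    option scheduler rr                       # unify needs a scheduler even with one subvolume
    subvolumes www-ds-afr
  end-volume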
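And here is, roughly, the command sequence I use to reproduce the problem (mount point and spec file path are placeholders):

  # both servers up, client mounted on /mnt/www
  touch /mnt/www/before                 # replicated to both servers, ok

  # simulate a crash of www2
  killall -9 glusterfsd                 # on www2

  # continue working through the surviving server
  rm /mnt/www/before
  touch /mnt/www/after                  # stored in www1's export dir and namespace

  # bring www2 back
  glusterfsd -f /etc/glusterfs/glusterfs-server.vol   # on www2; on the first client
                                                       # access it dies again and empties
                                                       # its export dir and namespace

  # attempts to trigger self healing of the whole tree (no effect here)
  ls -lR /mnt/www > /dev/null
  find /mnt/www > /dev/null

  # only an explicit access heals an individual file
  cat /mnt/www/after                    # now "after" reappears on www2

If it turns out that afr only heals a file when it is actually opened and read, then something like "find /mnt/www -type f -exec head -c1 '{}' \; > /dev/null" would be my workaround, but I do not know whether that is the intended way, hence my questions below.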
So my questions are:

  - Is this behaviour expected for "self healing"?
  - Is my config invalid?
  - Why does glusterfsd on the crash-simulated server node crash (shut down?) the first time it is started again?
  - Basically: what am I doing wrong? :)

Hope you can point me in the right direction. Thanks in advance,

Daniel