On 04/11/2012 07:00 AM, Alex Florescu wrote:
> Simulation follows:
> step 1
> node1:
> iptables -I INPUT 1 -s 10.0.2.15 -j DROP (connectivity loss simulation)
> touch /a/howareyou
>
> node2:
> touch /a/hello
>
> step 2
> node1:
> iptables -D INPUT 1 (connectivity recovery)
> ls /a
> ls: cannot access /a: Input/output error
>
> node2:
> ls /a
> ls: cannot access /a: Input/output error

I was able to reproduce this on my own setup using packages built from
git, which was a bit of a surprise TBH. I'll look into it, but here are
some observations that might suggest workarounds.

(1) To a first approximation, it should be safe to "merge" directory
contents despite there being a split-brain problem, by healing any file
that exists on only one brick from there to its peer(s). This contrasts
with the case for file contents, where - as Robert points out - we can't
determine the correct thing to do and would risk overwriting data.
Directory entries differ from file contents in a small but important
way: they're sets, not arrays. If something's not in the set, there's no
danger that adding it will overwrite anything.

(2) That said, the case you've created is indistinguishable from the
case where "hello" and "howareyou" used to exist on both bricks and each
side *deleted* one while they couldn't communicate. Unconditionally
recreating the files would effectively undo those deletes, which many
would consider an error as serious as overwriting data. It would not be
valid for such merge behavior to kick in unconditionally. At the very
least, there should be a configuration option for it.

(3) The reason you continue to get I/O errors is probably that the
xattrs on the *parent directory* still indicate pending operations on
both sides.
You can verify this with the following command on each brick:

    getfattr -d -e hex -m trusted.afr /a

The format of this value is described here:

    http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/

If the result is non-zero (most likely in the last four-byte integer,
which counts directory-entry operations) then that confirms the theory.
It should be safe for the self-heal code to clear these counts if (and
only if) the directories are checked and found identical. In fact, I
think we already do this. Thus, manually copying the missing files so
that both bricks hold the same entries, followed by self-heal on the
parent directory, should make the errors go away. I encourage you to try
that while I go look at the code.
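
To make the hex values easier to read, here is a minimal Python sketch
that splits one of them into its three counters. It assumes the pending
counts live in the trusted.afr.* xattrs and follow the layout described
on the page linked above: twelve bytes holding three big-endian 32-bit
integers, in the order data, metadata, entry. The function names and the
volume name "myvol" are made up for illustration; they are not part of
GlusterFS.

```python
import struct


def decode_afr_pending(hex_value):
    """Split a 12-byte pending-operation value into its three counters.

    Assumed layout: three big-endian 32-bit integers, in the order
    data, metadata, entry.
    """
    text = hex_value.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def parse_getfattr_line(line):
    """Parse one 'name=0x...' line as printed by 'getfattr -e hex'."""
    name, _, value = line.strip().partition("=")
    return name, decode_afr_pending(value)


if __name__ == "__main__":
    # Hypothetical sample line; "myvol" is a placeholder volume name.
    sample = "trusted.afr.myvol-client-1=0x" + "00000000" + "00000000" + "00000002"
    name, counts = parse_getfattr_line(sample)
    print(name, counts)
```

A non-zero "entry" count on the parent directory, seen on both bricks
and each blaming the other, would be consistent with the split-brain
explanation in (3).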