On 08/15/2012 02:15 PM, Emmanuel Dreyfus wrote:
>> It's odd that the file even exists in both replica sets.
>
> It is a directory. Directories should be on all bricks, shouldn't they?

Yes, they should. That clears up that particular mystery.

>   1: volume gfs33-client-0
>   2:     type protocol/client
>   3:     option remote-host silo
>   4:     option remote-subvolume /export/wd3a
> (...)
>   8: end-volume
>   9:
>  10: volume gfs33-client-1
>  11:     type protocol/client
>  12:     option remote-host hangar
>  13:     option remote-subvolume /export/wd3a
> (...)
>  17: end-volume
>  18:
>  19: volume gfs33-client-2
>  20:     type protocol/client
>  21:     option remote-host hangar
>  22:     option remote-subvolume /export/wd1a
> (...)
>  26: end-volume
>  27:
>  28: volume gfs33-client-3
>  29:     type protocol/client
>  30:     option remote-host hotstuff
>  31:     option remote-subvolume /export/wd1a
> (...)
>  35: end-volume
>  36:
>  37: volume gfs33-replicate-0
>  38:     type cluster/replicate
>  39:     subvolumes gfs33-client-0 gfs33-client-1
>  40: end-volume
>  41:
>  42: volume gfs33-replicate-1
>  43:     type cluster/replicate
>  44:     subvolumes gfs33-client-2 gfs33-client-3
>  45: end-volume

That all looks perfectly reasonable, which leaves us with a conundrum.
If client-1 is listed second in the replicate-0 definition, then the 2
should be in the *second* column of the pending matrix regardless of
what's going on with hosts/DNS. It's unclear either how we get a 2 in
the first column or how (without any "ignorant" bricks) we get another 1
anywhere.

Maybe if you could look at the actual xattr values when the code enters
afr_build_sources, we could see what the pending matrix looks like
before we start tweaking it. That at least divides the problem space
into cases where we have the wrong value when we start and cases where
we create a wrong value within the code.

-- 
ObSig: if you use "ask" as a noun I will ignore you for a week.
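
P.S. For anyone collecting those xattr values by hand, here is a minimal
Python sketch of how the pending matrix for replicate-0 could be rebuilt
from `getfattr -d -m . -e hex` output taken on each brick. It assumes the
usual AFR changelog layout: a 12-byte trusted.afr.<client> value holding
three big-endian 32-bit counters (data, metadata, entry). The hex values
in the example are made up for illustration and are not taken from this
thread, and the function names are hypothetical helpers, not GlusterFS
APIs. Row i is what brick i records; column j is what is pending against
client j.

```python
import struct

# Assumed order of the three counters inside a trusted.afr.* value.
FIELDS = ("data", "metadata", "entry")


def decode_pending(hex_value):
    """Decode one changelog xattr as printed by getfattr -e hex."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Three unsigned 32-bit integers in network (big-endian) byte order.
    return dict(zip(FIELDS, struct.unpack(">III", raw)))


def build_pending_matrix(xattrs_by_brick, children, kind):
    """Build matrix[i][j]: count of `kind` ops brick i holds against child j.

    `children` must be in the same order as the `subvolumes` line of the
    replicate volume, since that order fixes the column positions.
    """
    matrix = []
    for brick in children:
        row = []
        for child in children:
            value = xattrs_by_brick[brick].get("trusted.afr." + child)
            # A missing xattr is treated as "nothing pending" in this sketch.
            row.append(decode_pending(value)[kind] if value else 0)
        matrix.append(row)
    return matrix


# Hypothetical values: brick 0 records two pending entry operations
# against client-1, and brick 1 records nothing.
ZERO = "0x" + "00000000" * 3
TWO_ENTRY = "0x" + "00000000" + "00000000" + "00000002"

children = ["gfs33-client-0", "gfs33-client-1"]
xattrs = {
    "gfs33-client-0": {
        "trusted.afr.gfs33-client-0": ZERO,
        "trusted.afr.gfs33-client-1": TWO_ENTRY,
    },
    "gfs33-client-1": {
        "trusted.afr.gfs33-client-0": ZERO,
        "trusted.afr.gfs33-client-1": ZERO,
    },
}

matrix = build_pending_matrix(xattrs, children, "entry")
for row in matrix:
    print(row)
```

With client-1 listed second in the subvolumes line, a count recorded
against it lands in the second column, which is the placement expected
above. Since the object in question is a directory, the entry counter is
the one to compare first.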