On 06/23/2011 12:38 PM, Burnash, James wrote:
> Well, it took me 3 reads (1 out loud) to process what was going in
> your explanation - but then it all made sense :-)

Sorry about that. People have told me I can be a bit cryptic when I've
had too much caffeine. Unfortunately I'm worse when I haven't had
enough. ;)

> As to your question about xattributes on those 4 directories at the
> end of your message:
>
> fs18/g01/pfs-ro1-client-1
> getfattr -d -e hex -m - /export/read-only/g01
> getfattr: Removing leading '/' from absolute path names
> # file: export/read-only/g01
> trusted.afr.pfs-ro1-client-0=0x000000000600000800000000
> trusted.afr.pfs-ro1-client-1=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x0000000100000000333333303ffffffb
> trusted.glusterfs.test=0x776f726b696e6700

OK, so much for that theory. The xattrs for the nodes themselves do
seem to be present and valid, so we wouldn't get past the first step of
the sequence I described.

What I do know is that we're getting to a place in the code (we call
afr_sh_wise_nodes_conflict and it returns TRUE) where all of the nodes
seem to be "accusing" each other even though no such pattern is
apparent from the xattrs. If the xattrs are there, then maybe we're
looking at the wrong ones. Does fs18 know that its g01 is "client-1" in
the volfile, and thus should be looking at trusted.afr.pfs-ro1-client-1
for itself? Or does it think it should be looking at
trusted.afr.pfs-ro1-client57 - which obviously isn't there? I guess
there might be cases where that could happen because of inconsistent
volfiles, perhaps some kind of DNS issues, maybe even bugs in the code.
Maybe looking for "pending_key" in a state dump (what you get if you
send SIGUSR1 to glusterfsd) would reveal any such inconsistencies.

Yeah, I'm grasping at straws here. It's just not making sense.
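In case it helps when comparing xattrs across bricks, here is a rough
Python sketch for decoding the trusted.afr.* values out of getfattr
output. It assumes the usual AFR changelog layout as I understand it:
12 bytes holding three big-endian 32-bit pending counters, in the order
data, metadata, entry. The function names (decode_afr_changelog,
parse_getfattr, pending_counts) are made up for this example and are
not part of GlusterFS.

```python
import struct

# The getfattr output quoted earlier in this message, pasted verbatim.
SAMPLE = """\
getfattr: Removing leading '/' from absolute path names
# file: export/read-only/g01
trusted.afr.pfs-ro1-client-0=0x000000000600000800000000
trusted.afr.pfs-ro1-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000333333303ffffffb
trusted.glusterfs.test=0x776f726b696e6700
"""


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value into (data, metadata, entry) counters.

    ASSUMPTION: the value is 12 bytes, three big-endian uint32 counters.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    return struct.unpack(">III", raw)


def parse_getfattr(text):
    """Return {xattr_name: hex_string} from `getfattr -d -e hex` output."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, "# file:" headers and getfattr's own messages.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        attrs[name] = value
    return attrs


def pending_counts(text, prefix="trusted.afr."):
    """Map each AFR client name to its decoded pending counters."""
    return {
        name[len(prefix):]: decode_afr_changelog(value)
        for name, value in parse_getfattr(text).items()
        if name.startswith(prefix)
    }


if __name__ == "__main__":
    for client, (data, meta, entry) in sorted(pending_counts(SAMPLE).items()):
        flag = "pending" if (data or meta or entry) else "clean"
        print("%s: data=%d metadata=%d entry=%d (%s)"
              % (client, data, meta, entry, flag))
```

If the layout assumption holds, a non-zero counter under client-N on a
given brick means that brick is "accusing" client-N. Running the same
decode on the output from each brick in the replica set should make it
easier to see whether the accusations really are mutual.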