On 11/11/2013 12:33 PM, Jeff Darcy wrote:
> There's nothing about a split-network configuration like yours that
> would cause something like this *by itself*, but anything that creates
> greater complexity also creates new possibilities for something to go
> wrong. Just to be safe, if I were you, I'd double- and triple-check the
> DNS and /etc/hosts configurations on all machines to make sure some tiny
> error didn't creep in. If your bricks are at the same paths on each
> machine, it would be possible for a machine to think it's connecting to
> one brick and actually end up connecting to another. I haven't even
> been able to think through all of the ramifications, but just thinking
> about how that might affect rebalance makes me a bit queasy.

As far as I can tell, the /etc/hosts files and DNS are configured
correctly. All four of the hosts with bricks have identical /etc/hosts
files. I was very careful to double-check everything I'm including
below before beginning anything, and I also checked it just now.

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.108.0.21 slc01dfs001a-pub.REDACTED.com slc01dfs001a-pub
10.108.0.22 slc01dfs001b-pub.REDACTED.com slc01dfs001b-pub
10.108.0.23 slc01dfs002a-pub.REDACTED.com slc01dfs002a-pub
10.108.0.24 slc01dfs002b-pub.REDACTED.com slc01dfs002b-pub
10.116.0.21 slc01dfs001a.REDACTED.com slc01dfs001a
10.116.0.22 slc01dfs001b.REDACTED.com slc01dfs001b
10.116.0.23 slc01dfs002a.REDACTED.com slc01dfs002a
10.116.0.24 slc01dfs002b.REDACTED.com slc01dfs002b

The addresses that are on the -pub entries here are what is in DNS for
the names without -pub, and the -pub names themselves do not exist in DNS.
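A check like the one described above could also be scripted. What follows
is only a minimal Python sketch, not something taken from this setup: the
function names parse_hosts and resolve_mismatches are hypothetical, and it
assumes the system resolver (socket.gethostbyname, IPv4 only) is what
should be compared against the hosts-file entries.

```python
import socket


def parse_hosts(text):
    """Map each name in hosts-file text to the set of addresses listed for it."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        addr, *aliases = line.split()
        for name in aliases:
            names.setdefault(name, set()).add(addr)
    return names


def resolve_mismatches(names, resolver=socket.gethostbyname):
    """Return {name: (expected_ipv4_addrs, got)} for names that resolve unexpectedly.

    Only IPv4 entries are checked, because gethostbyname is IPv4-only.
    The resolver argument is an illustrative hook so a fake can be passed in.
    """
    bad = {}
    for name, addrs in names.items():
        ipv4 = {a for a in addrs if ":" not in a}
        if not ipv4:
            continue  # name has only IPv6 entries, skip it
        try:
            got = resolver(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            got = "unresolved: %s" % exc
        if got not in ipv4:
            bad[name] = (ipv4, got)
    return bad
```

Calling resolve_mismatches(parse_hosts(open("/etc/hosts").read())) on a
host with bricks should return an empty dict if every name resolves to the
address its hosts file lists. On a host with no such entries the same names
would be answered by DNS instead, so the result there depends on which
hosts file is used as the reference.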

Here is what 'gluster volume status mdfs' says:

Status of volume: mdfs
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick slc01dfs001a:/bricks/d00v00/mdfs          24025   Y       7739
Brick slc01dfs001b:/bricks/d00v00/mdfs          24025   Y       2547
Brick slc01dfs001a:/bricks/d00v01/mdfs          24026   Y       7744
Brick slc01dfs001b:/bricks/d00v01/mdfs          24026   Y       2552
Brick slc01dfs001a:/bricks/d00v02/mdfs          24027   Y       7750
Brick slc01dfs001b:/bricks/d00v02/mdfs          24027   Y       2558
Brick slc01dfs001a:/bricks/d00v03/mdfs          24028   Y       7756
Brick slc01dfs001b:/bricks/d00v03/mdfs          24028   Y       2564
Brick slc01dfs001a:/bricks/d01v00/mdfs          24029   Y       7762
Brick slc01dfs001b:/bricks/d01v00/mdfs          24029   Y       2570
Brick slc01dfs001a:/bricks/d01v01/mdfs          24030   Y       7768
Brick slc01dfs001b:/bricks/d01v01/mdfs          24030   Y       2576
Brick slc01dfs001a:/bricks/d01v02/mdfs          24031   Y       7774
Brick slc01dfs001b:/bricks/d01v02/mdfs          24031   Y       2582
Brick slc01dfs001a:/bricks/d01v03/mdfs          24032   Y       7780
Brick slc01dfs001b:/bricks/d01v03/mdfs          24032   Y       2588
Brick slc01dfs002a:/bricks/d00v00/mdfs          24017   Y       23691
Brick slc01dfs002b:/bricks/d00v00/mdfs          24017   Y       23802
Brick slc01dfs002a:/bricks/d00v01/mdfs          24018   Y       23696
Brick slc01dfs002b:/bricks/d00v01/mdfs          24018   Y       23807
Brick slc01dfs002a:/bricks/d00v02/mdfs          24019   Y       23702
Brick slc01dfs002b:/bricks/d00v02/mdfs          24019   Y       23813
Brick slc01dfs002a:/bricks/d00v03/mdfs          24020   Y       23708
Brick slc01dfs002b:/bricks/d00v03/mdfs          24020   Y       23819
Brick slc01dfs002a:/bricks/d01v00/mdfs          24021   Y       23714
Brick slc01dfs002b:/bricks/d01v00/mdfs          24021   Y       23825
Brick slc01dfs002a:/bricks/d01v01/mdfs          24022   Y       23720
Brick slc01dfs002b:/bricks/d01v01/mdfs          24022   Y       23831
Brick slc01dfs002a:/bricks/d01v02/mdfs          24023   Y       23726
Brick slc01dfs002b:/bricks/d01v02/mdfs          24023   Y       23837
Brick slc01dfs002a:/bricks/d01v03/mdfs          24024   Y       23732
Brick slc01dfs002b:/bricks/d01v03/mdfs          24024   Y       23843
NFS Server on localhost                         38467   Y       21318
Self-heal Daemon on localhost                   N/A     Y       21324
NFS Server on slc01nas2                         38467   Y       49120
Self-heal Daemon on slc01nas2                   N/A     Y       49126
NFS Server on slc01nas1                         38467   Y       12335
Self-heal Daemon on slc01nas1                   N/A     Y       12341
NFS Server on slc01dfs001b                      38467   Y       5390
Self-heal Daemon on slc01dfs001b                N/A     Y       5396
NFS Server on slc01dfs002a                      38467   Y       23740
Self-heal Daemon on slc01dfs002a                N/A     Y       23746
NFS Server on slc01dfs002b                      38467   Y       23850
Self-heal Daemon on slc01dfs002b                N/A     Y       23856

The two hosts without bricks (slc01nas1 and slc01nas2) have only
localhost entries in /etc/hosts.

Here's gluster peer status from slc01dfs001a. From the other five
hosts, it looks similar and they all say Connected.

Number of Peers: 5

Hostname: slc01nas2
Uuid: 4bb5b123-7420-4b6c-a542-3b15fc2104f8
State: Peer in Cluster (Connected)

Hostname: slc01nas1
Uuid: 1d087f2c-08b0-4de3-a547-c9e8f1255049
State: Peer in Cluster (Connected)

Hostname: slc01dfs001b
Uuid: 766a490a-132f-4baa-bf4c-193f49af3274
State: Peer in Cluster (Connected)

Hostname: slc01dfs002a
Uuid: 18a6936c-a721-49e2-82aa-fbe525986e25
State: Peer in Cluster (Connected)

Hostname: slc01dfs002b
Uuid: 5fd3e39d-dbb4-4f24-a3f7-3e0629839b2b
State: Peer in Cluster (Connected)

The iptables firewall and selinux are disabled on every host.

Thanks,
Shawn