I had a two-node replicated/distributed volume, spread across server1:/bricks/1, server2:/bricks/1, server1:/bricks/2 and server2:/bricks/2. I powered down server2 in order to re-rack it and make room for server3. server2 now fails to come up, for reasons that have nothing to do with gluster, so I decided to go ahead, bring up server3, and move server2's bricks to it.

I found conflicting information on how to do that when the old node is completely dead and the new node has a different name. Basically, I did a peer probe of server3, then volume replace-brick <share name> server2:/bricks/1 server3:/bricks/1, and then a volume replace-brick <blah> commit force. This was probably a bad idea.

Next I tried to do the replace-brick for the second brick (bricks/2). It fails to start, saying replace-brick is already running on the volume. Now I'm stuck. The data in bricks/1 DOES appear on the new node, but I can't do anything with bricks/2:

- If I try to do a commit, it says bricks/1 isn't on server2.
- If I try to do anything else, it says replace-brick is running.
- I ran a rebalance, hoping that would fix it, but it has not.
- I attempted to stop the volume, but it said I couldn't until the replace-brick was committed or aborted.
- I cannot abort; it says replace-brick abort failed.

Now what? Mind you, this is a temporary setup that has a complex directory structure but no data as yet. We are looking to use this in production VERY soon, and (a) I'm not sure I have time to rebuild everything, and, more importantly, (b) I need to be able to demonstrate to management that "look, a node failed and we replaced it with no data loss".

So, what's my next step to get this mess untangled and the data safely onto my new node?