Trying this command to remove the brick on the failed node:

[root@lme-fw2 ~]# gluster volume remove-brick firewall-scripts 192.168.253.1:/firewall-scripts
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
[root@lme-fw2 ~]#

But running top in another window - I see 97.8%wa with 0.0%id. This suggests this system is spending all its time idle, waiting for disk I/Os to complete. Even after I removed the brick associated with the dead node. And gluster volume info has been hung for the past several minutes.

After 5+ minutes, it finally tells me no volumes present. So what happened to the volumes I set up? But check this out:

[root@lme-fw2 ~]# gluster volume info
No volumes present
[root@lme-fw2 ~]#

In another window, I cd /firewall-scripts and look at a file. This is my gluster volume. Then I do this again:

[root@lme-fw2 ~]#
[root@lme-fw2 ~]# gluster volume info

Volume Name: firewall-scripts
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.253.1:/gluster-fw1
Brick2: 192.168.253.2:/gluster-fw2
Options Reconfigured:
network.ping-timeout: 5
[root@lme-fw2 ~]#

And now my volume shows up. With both bricks. What's up with that? I removed the old brick but now it's here. I also set my ping-timeout to 5 seconds something like an hour ago.

So trying to remove the brick again... At least it generates some output this time:

[root@lme-fw2 ~]# gluster volume remove-brick firewall-scripts 192.168.253.1:/firewall-scripts
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Incorrect brick 192.168.253.1:/firewall-scripts for volume firewall-scripts
[root@lme-fw2 ~]#

Ah - my brick name is wrong. Trying again with the correct brick name.... Uh-oh!

[root@lme-fw2 ~]#
[root@lme-fw2 ~]# gluster volume remove-brick firewall-scripts 192.168.253.1:/gluster-fw1
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Remove Brick successful
[root@lme-fw2 ~]#
[root@lme-fw2 ~]# gluster volume info

And we're hung. When I tell the surviving node to take out the brick from the failed node, why does Gluster on the surviving node hang???

[root@lme-fw2 firewall-scripts]# ls
ls: cannot open directory .: Transport endpoint is not connected
[root@lme-fw2 firewall-scripts]# pwd
/firewall-scripts
[root@lme-fw2 firewall-scripts]#
[root@lme-fw2 firewall-scripts]# ls
[root@lme-fw2 firewall-scripts]# ls
ls: cannot open directory .: Transport endpoint is not connected
[root@lme-fw2 firewall-scripts]# pwd
/firewall-scripts
[root@lme-fw2 firewall-scripts]#
[root@lme-fw2 firewall-scripts]# ls
ls: cannot access allow-all-with-nat: Transport endpoint is not connected
ls: cannot access rc.firewall: Transport endpoint is not connected
ls: cannot access rcfirewall.conf: Transport endpoint is not connected
ls: cannot access make-virgin.sh: Transport endpoint is not connected
ls: cannot access start-failover-monitor.sh: Transport endpoint is not connected
ls: cannot access failover-monitor.sh: Transport endpoint is not connected
ls: cannot access rcfirewall.conf-20120201: Transport endpoint is not connected
ls: cannot access rc.firewall-20120201: Transport endpoint is not connected
ls: cannot access fwdate.txt: Transport endpoint is not connected
ls: cannot access rcfirewall.conf-20120210: Transport endpoint is not connected
ls: cannot access rcfirewall.conf-20120302: Transport endpoint is not connected
ls: cannot access rc.firewall-20120302: Transport endpoint is not connected
ls: cannot access failover-monitor.sh-20120406: Transport endpoint is not connected
ls: cannot access rc.firewall-20120704: Transport endpoint is not connected
ls: cannot access rcfirewall.conf-20120704: Transport endpoint is not connected
ls: cannot access initial_rc.firewall-20120708: Transport endpoint is not connected
ls: cannot access =: Transport endpoint is not connected
ls: cannot access append.txt: Transport endpoint is not connected
ls: cannot access rc.firewall-20120708: Transport endpoint is not connected
ls: cannot access rcfirewall.conf-20120708: Transport endpoint is not connected
ls: reading directory .: Transport endpoint is not connected
=                    failover-monitor.sh-20120406  rc.firewall-20120302      rcfirewall.conf-20120302
allow-all            fwdate.txt                    rc.firewall-20120704      rcfirewall.conf-20120704
allow-all-with-nat   initial_rc.firewall-20120708  rc.firewall-20120708      rcfirewall.conf-20120708
append.txt           make-virgin.sh                rcfirewall.conf           start-failover-monitor.sh
etc                  rc.firewall                   rcfirewall.conf-20120201
failover-monitor.sh  rc.firewall-20120201          rcfirewall.conf-20120210
[root@lme-fw2 firewall-scripts]#
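
A side note on the "Incorrect brick" error above: it came from passing the mount point (/firewall-scripts) instead of the brick path (/gluster-fw1). Here is a minimal sketch of a pre-check that could catch that before running remove-brick. The function name brick_in_volume is made up for illustration, and it assumes the "BrickN: HOST:/PATH" lines look the way they do in the gluster volume info output shown above. It only reads text from stdin and does not call gluster itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of gluster): reads `gluster volume info`
# output on stdin and succeeds only if the given brick is listed.
# Assumes brick lines have the form "Brick1: HOST:/PATH", as in the
# output pasted above.
brick_in_volume() {
    # $1 = brick to look for, written as HOST:/PATH
    awk -v want="$1" '
        # Match lines such as "Brick1: 192.168.253.1:/gluster-fw1";
        # the second whitespace-separated field is HOST:/PATH.
        /^[ \t]*Brick[0-9]+:/ { if ($2 == want) found = 1 }
        # Exit status 0 if the brick was seen, 1 otherwise.
        END { exit !found }
    '
}

# Example use (not executed here; the volume and brick names are the ones
# from this post):
#   gluster volume info firewall-scripts \
#       | brick_in_volume 192.168.253.1:/gluster-fw1 \
#       && gluster volume remove-brick firewall-scripts 192.168.253.1:/gluster-fw1
```

This only guards against a mistyped brick name. It does nothing about the hang itself, since gluster volume info would still have to return before the check could run.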