Hello,

After some experimentation, I have discovered what is causing my client <-> server connections to hang periodically until the server is restarted.

The clients are running on stock CentOS, which has an iptables configuration that only allows traffic to flow on TCP/IP sessions that were initiated while that instance of iptables is running (--state ESTABLISHED,RELATED). So when iptables was restarted, the return traffic from the glusterfs server was not reaching the clients, even though connectivity between the two was restored and new TCP/IP sessions worked with no problem.

What I don't understand is why the clients think the broken TCP/IP session is still valid and do not try to reach the server with a new session.

Adding a rule to iptables that allows all traffic from the servers on that port (not just traffic related to established connections) at least solves my hanging problem across iptables restarts. But it makes me worry about real-life situations where the server might disappear (i.e. an unplugged cable, a data center outage, or other lower-layer outages). In those kinds of network outages, wouldn't a similar situation be created, where the server would not be able to reach the clients? Why wouldn't the clients create a new TCP/IP connection to the server when they recognize a timeout on one connection and the server is still responding to new connections?

Anyway, now that this "stability" issue is solved, I look forward to experimenting with many wild and wacky combinations of translators.

Thanks for your patience!
:august

On 7/31/07, August R. Wohlt <glusterfs@xxxxxxxxxxx> wrote:
>
> Hi, memory (3g/4g used) and cpu are normal (load of ~1.5) when this
> happens, and when I run the server in gdb, it is not captured. I have to
> suspend it to get a backtrace. Similarly, when run outside of gdb, it
> doesn't crash. The server just ends up not responding to either of the
> clients.
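
P.S. On the question of how a client could notice a dead session by itself: one general mechanism is TCP keepalive. Below is a minimal Python sketch of turning it on for a socket. I have not checked whether glusterfs does anything like this. The function name enable_keepalive and the timing values are made up for illustration, and the three tuning options are Linux-specific, so the code skips any that the platform does not define.

```python
import socket


def enable_keepalive(sock, idle=30, interval=10, probes=3):
    """Turn on TCP keepalive so a silently dropped peer is eventually detected.

    idle, interval and probes are illustrative values, not glusterfs settings.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning knobs; these names exist on Linux but not everywhere,
    # so only set the ones this platform's socket module exposes.
    for name, value in (("TCP_KEEPIDLE", idle),      # idle seconds before first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", probes)):     # unanswered probes before giving up
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print("keepalive on:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

With values like these, an idle connection whose peer has gone silent should be dropped by the kernel after roughly idle + interval * probes seconds, and the next read or write on it fails with an error, at which point the application can open a new session. Keepalive only covers idle connections; a connection with unacknowledged data in flight is governed by the normal retransmission timeout instead.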