Gordan, can you post the complete client log from the time of mount?

Avati

On Sat, Jan 30, 2010 at 12:11 AM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> I'm seeing things like this in the logs, coupled with things locking up for
> a while until the timeout is complete:
>
> [2010-01-29 18:29:01] E [client-protocol.c:415:client_ping_timer_expired]
> home2: Server 10.2.0.10:6997 has not responded in the last 42 seconds,
> disconnecting.
> [2010-01-29 18:29:01] E [client-protocol.c:415:client_ping_timer_expired]
> home2: Server 10.2.0.10:6997 has not responded in the last 42 seconds,
> disconnecting.
>
> The thing is, I know for a fact that there is no network outage of any sort.
> All the machines are on a local gigabit ethernet, and there is no
> connectivity loss observed anywhere else. ssh sessions going to the machines
> that are supposedly "not responding" remain alive and well, with no lag.
>
> The NICs in all the servers are a mix of Marvell (using the Marvell sk98lin
> driver) and Realtek (using the Realtek r8168 driver) - none of which have
> exhibited any other observable problems in use.
>
> In 42 seconds, TCP would have re-transmitted if the packets really have
> gotten lost, so I'm not convinced it's packet loss (glfs uses TCP, right?).
> If it's not packet loss, then that implies that glfs daemons get stuck
> somewhere and either miss or ignore the packets in question. It smells like
> a bug, and it's not a new one, either - I have observed this in 2.0.x, too.
> It typically happens under heavy load (e.g. resyncing a volume to an empty
> server, or doing "ls -laR" on a volume to make sure it's up to date on all
> servers). In such cases, the network bandwidth used is nowhere near what the
> network can handle, nor are the CPUs in the servers anywhere near being
> maxed out - most of the time is spent waiting for the latencies (ping and
> context switches) to catch up. So I don't think it's a load (CPU or network)
> issue.
>
> Is there a way to help debug this further?
>
> Gordan
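
One way to narrow this down before posting the full log: a minimal Python sketch that pulls the ping-timeout entries out of a client log and counts them per volume and server. It assumes the entries look exactly like the two quoted above; other GlusterFS versions may format them differently. The names find_ping_timeouts and count_by_server are made up for illustration and are not part of GlusterFS.

```python
import re
from collections import Counter

# Pattern assumed from the two log lines quoted in this thread; it is not
# taken from the GlusterFS sources and may need adjusting for other versions.
PING_TIMEOUT = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] E "
    r"\[client-protocol\.c:\d+:client_ping_timer_expired\]\s+"
    r"(?P<volume>\S+): Server (?P<server>\S+) has not responded "
    r"in the last (?P<secs>\d+) seconds"
)


def find_ping_timeouts(text):
    """Return (timestamp, volume, server, seconds) for each timeout entry."""
    # Entries may be wrapped across lines (as in the mail), so collapse
    # all runs of whitespace into single spaces before matching.
    flat = " ".join(text.split())
    return [
        (m["ts"], m["volume"], m["server"], int(m["secs"]))
        for m in PING_TIMEOUT.finditer(flat)
    ]


def count_by_server(events):
    """Count timeout events per (volume, server) pair."""
    return Counter((volume, server) for _, volume, server, _ in events)
```

Feeding it the contents of the client log (wherever that lives on your setup) and comparing the timestamps against when the resync or "ls -laR" was running should show whether the timeouts cluster on one server or are spread across all of them.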