No, this is a different problem. If the transport timeout were the
problem, the access should return in under 60 seconds, should it not? In
the case I'm seeing, something goes wrong and the only way to recover is
to restart glusterfsd on the server(s) _AND_ glusterfs on the clients.
It's hard to reproduce, as I only see it happening about once a week.
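
One way to tell the two cases apart would be to time a metadata access on
the mount: if a transport timeout is at work, the call should come back
within roughly that timeout, whereas in the hang it would not. A minimal
Python sketch, assuming the 60-second limit mentioned above; the function
name timed_stat is hypothetical and this is an illustration, not a
GlusterFS tool:

```python
import os
import threading
import time


def timed_stat(path, timeout_s=60.0):
    """Return (completed, elapsed_seconds, error) for an os.stat() on path.

    The stat runs in a daemon thread so that a hung mount cannot block
    the caller for longer than timeout_s (an assumed limit).
    """
    result = {}

    def worker():
        try:
            os.stat(path)
            result["error"] = None
        except OSError as exc:
            # A prompt error (e.g. ENOTCONN) still counts as "returned".
            result["error"] = exc

    start = time.monotonic()
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s for the stat to finish
    elapsed = time.monotonic() - start
    completed = not thread.is_alive()
    return completed, elapsed, result.get("error")
```

Calling timed_stat() on a file under the affected mount (for example a
user's ~/.bashrc) and logging the tuple would show whether the access
returned at all. If the worker thread is stuck in an uninterruptible
system call, the probing process itself may not exit cleanly until that
call returns.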
Gordan
On Sat, 7 Jun 2008, Krishna Srinivas wrote:
Gordan,
Is this a case of transport-timeout being set too high?
Krishna
On Sat, Jun 7, 2008 at 1:04 AM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
Hi,
I have /home mounted from GlusterFS with AFR, and if one of the servers
(the secondary) goes away, I cannot log in. sshd tries to read ~/.ssh and
bash tries to read ~/.bashrc, and these reads seem to fail, or at least
take a very long time to time out and fall back to the remaining server
(which verifiably works).
I get this sort of thing in the logs:
E [tcp-client.c:190:tcp_connect] home2: non-blocking connect() returned: 110 (Connection timed out)
E [client-protocol.c:4423:client_lookup_cbk] home2: no proper reply from server, returning ENOTCONN
C [client-protocol.c:212:call_bail] home2: bailing transport
where home2 is the name of the GlusterFS export on the secondary.
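
To see how often each volume logs these messages, the entries can be split
into fields. A rough Python sketch, assuming each entry sits on one line in
exactly the shape shown above (severity letter, [file:line:function],
volume name, message); real log files may carry a timestamp or other
prefix that this pattern does not handle, and the names parse_line and
count_by_volume_and_level are hypothetical:

```python
import re
from collections import Counter

# Pattern for lines shaped like the excerpt above (an assumption, not a
# documented GlusterFS log grammar).
PATTERN = re.compile(
    r"^(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<volume>[^:]+): (?P<message>.*)$"
)

SAMPLE = [
    "E [tcp-client.c:190:tcp_connect] home2: non-blocking connect() "
    "returned: 110 (Connection timed out)",
    "E [client-protocol.c:4423:client_lookup_cbk] home2: no proper reply "
    "from server, returning ENOTCONN",
    "C [client-protocol.c:212:call_bail] home2: bailing transport",
]


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])  # source line number as an integer
    return record


def count_by_volume_and_level(lines):
    """Count matching entries per (volume, level) pair, skipping other lines."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[(record["volume"], record["level"])] += 1
    return counts
```

Running count_by_volume_and_level() over a client log would give a quick
picture of which export is producing the connect failures and bail-outs.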
Is this a known issue or have I managed to trip another error case?
Gordan
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel