Re: GlusterFS AFR not failing over

No - this is a different problem. If the transport timeout were the problem, the access should return in under 60 seconds, should it not? In the case I'm seeing, something goes wrong and the only way to recover is to restart glusterfsd on the server(s) _AND_ glusterfs on the clients.

It's kind of hard to reproduce, as I only see it happening about once every week or so.

Gordan

On Sat, 7 Jun 2008, Krishna Srinivas wrote:

Gordan,

Is this the case of transport-timeout being high?

Krishna

On Sat, Jun 7, 2008 at 1:04 AM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
Hi,

I have /home mounted from GlusterFS with AFR, and if one of the servers
(secondary) goes away, I cannot log in. sshd tries to read ~/.ssh and bash
tries to read ~/.bashrc and this seems to fail - or at least take a very
long time to time out and try the remaining server (which verifiably works).

I get this sort of thing in the logs:

E [tcp-client.c:190:tcp_connect] home2: non-blocking connect() returned: 110 (Connection timed out)
E [client-protocol.c:4423:client_lookup_cbk] home2: no proper reply from server, returning ENOTCONN
C [client-protocol.c:212:call_bail] home2: bailing transport

where home2 is the name of the GlusterFS export on the secondary.
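
Since the problem is intermittent, it may help to scan the client log for these entries automatically. The following is a minimal Python sketch, not part of GlusterFS: it assumes the line layout `LEVEL [file.c:LINE:function] volume: message`, which is inferred only from the three entries quoted above, and that each entry sits on a single line. The names `parse_line` and `bail_counts` are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the three entries quoted in this thread:
#   LEVEL [file.c:LINE:function] volume: message
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<volume>[^:]+): (?P<message>.*)$"
)


def parse_line(text):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])  # source line number as an integer
    return fields


def bail_counts(lines):
    """Count call_bail entries per volume name."""
    counts = Counter()
    for text in lines:
        entry = parse_line(text)
        if entry is not None and entry["func"] == "call_bail":
            counts[entry["volume"]] += 1
    return counts


# The three entries from this thread, each joined onto one line.
SAMPLE = [
    "E [tcp-client.c:190:tcp_connect] home2: non-blocking connect() returned: 110 (Connection timed out)",
    "E [client-protocol.c:4423:client_lookup_cbk] home2: no proper reply from server, returning ENOTCONN",
    "C [client-protocol.c:212:call_bail] home2: bailing transport",
]

if __name__ == "__main__":
    for volume, count in bail_counts(SAMPLE).items():
        print(f"{volume}: {count} bail(s)")
```

Run against a real client log, `bail_counts(open(path))` would give a per-volume tally, where `path` is wherever the client log is written on a given system. Lines that do not fit the assumed layout are skipped rather than treated as errors.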

Is this a known issue or have I managed to trip another error case?

Gordan







