Jordi Moles Blanco wrote:
Anand Avati wrote:
For several days I didn't pay attention to the gluster logs, as
everything worked fine. However, today I was moving a 500MB file when
the mount point went stale, and I couldn't access the data from that
particular client. The gluster cluster itself didn't seem to be affected:
the nodes didn't report any problem at all in their log files, and other
clients kept the mount point without any problem.
Then I decided to have a look at the log files:
*************
2009-01-30 11:00:41 W [client-protocol.c:332:client_protocol_xfer] espai1: not connected at the moment to submit frame type(1) op(15)
2009-01-30 11:00:41 E [client-protocol.c:3891:client_statfs_cbk] espai1: no proper reply from server, returning ENOTCONN
2009-01-30 11:00:41 E [tcp-client.c:190:tcp_connect] espai5: non-blocking connect() returned: 111 (Connection refused)
2009-01-30 11:00:43 W [client-protocol.c:332:client_protocol_xfer] espai2: not connected at the moment to submit frame type(1) op(15)
2009-01-30 11:00:43 E [client-protocol.c:3891:client_statfs_cbk] espai2: no proper reply from server, returning ENOTCONN
2009-01-30 11:00:43 E [tcp-client.c:190:tcp_connect] espai6: non-blocking connect() returned: 111 (Connection refused)
*************
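To see at a glance which subvolumes are affected, log lines like the ones above can be grouped by name. The following is only a minimal sketch: the regular expression is inferred from this excerpt rather than from any GlusterFS documentation, and `summarize` and `sample` are hypothetical names used for illustration.

```python
import re
from collections import defaultdict

# Layout inferred from the excerpt (an assumption, not a documented format):
# date time level [file:line:function] subvolume: message
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]) \[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<subvol>[^:]+): (?P<msg>.*)$"
)


def summarize(lines):
    """Map each subvolume name to the (level, function) pairs logged for it."""
    summary = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            summary[match["subvol"]].append((match["level"], match["func"]))
    return dict(summary)


# Three of the lines quoted above, unwrapped to one entry per line.
sample = [
    "2009-01-30 11:00:41 W [client-protocol.c:332:client_protocol_xfer] "
    "espai1: not connected at the moment to submit frame type(1) op(15)",
    "2009-01-30 11:00:41 E [client-protocol.c:3891:client_statfs_cbk] "
    "espai1: no proper reply from server, returning ENOTCONN",
    "2009-01-30 11:00:41 E [tcp-client.c:190:tcp_connect] "
    "espai5: non-blocking connect() returned: 111 (Connection refused)",
]

for subvol, events in sorted(summarize(sample).items()):
    print(subvol, events)
```

Running the same function over the whole client log would show whether the refused connections are limited to a few subvolumes or hit all of them.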
A connection refused error occurs when a daemon is not running, or when
there is a packet filter resetting connections. If the GlusterFS daemon is
running and other clients are able to access it normally, please make
sure there is no packet filtering of some sort happening. You can try
flushing all firewall rules if there are any. Based on the
description you give, it seems to be an issue outside GlusterFS.
Avati
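One way to check the two causes named above from the affected client is to attempt a plain TCP connection to the server and look at how it fails. This is a rough sketch, not a GlusterFS tool: `probe` is a hypothetical helper, and the host and port must be taken from your own volume files, since neither appears in this thread.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a single TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # something is listening and accepted the connection
    except ConnectionRefusedError:
        # errno 111 on Linux: no listener on the port, or a filter
        # answering with a reset
        return "refused"
    except socket.timeout:
        # no answer at all, which usually points at packets being dropped
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc


if __name__ == "__main__":
    # Placeholder values, NOT taken from the thread: replace them with the
    # server address and listen port configured in your setup.
    print(probe("127.0.0.1", 1))
```

If the probe reports "refused" from the failing client but "open" from a client that works, the difference most likely lies between the two machines (for example a firewall rule) rather than in the daemon itself.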
Hi,
thanks for the explanation about the origin of the error message.
Well... it doesn't look like there is a problem with the network on
which glusterfs runs; it would have appeared in the rrd graphs I'm
keeping for net traffic. Still, I'll carry out a full test to see if there's
the slightest problem which could generate this message.
Thanks.
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel
Hi,
since the last time we were in contact I've been trying to track down
where the problem is. I've been monitoring almost everything
related to network traffic, and eventually I found out what the
problem is by chance!
It turns out that when I run "df -h" on a client-server machine mounting
gluster, I get this:
***********
2009-02-19 17:15:43 E [tcp-client.c:190:tcp_connect] espai1:
non-blocking connect() returned: 111 (Connection refused)
2009-02-19 17:15:43 W [client-protocol.c:332:client_protocol_xfer]
espai1: not connected at the moment to submit frame type(1) op(15)
2009-02-19 17:15:43 E [client-protocol.c:3891:client_statfs_cbk] espai1:
no proper reply from server, returning ENOTCONN
2009-02-19 17:15:43 E [tcp-client.c:190:tcp_connect] espai5:
non-blocking connect() returned: 111 (Connection refused)
2009-02-19 17:15:43 W [client-protocol.c:332:client_protocol_xfer]
espai5: not connected at the moment to submit frame type(1) op(15)
2009-02-19 17:15:43 E [client-protocol.c:3891:client_statfs_cbk] espai5:
no proper reply from server, returning ENOTCONN
2009-02-19 17:15:43 E [tcp-client.c:190:tcp_connect] espai2:
non-blocking connect() returned: 111 (Connection refused)
2009-02-19 17:15:43 W [client-protocol.c:332:client_protocol_xfer]
espai2: not connected at the moment to submit frame type(1) op(15)
2009-02-19 17:15:43 E [client-protocol.c:3891:client_statfs_cbk] espai2:
no proper reply from server, returning ENOTCONN
2009-02-19 17:15:43 E [tcp-client.c:190:tcp_connect] espai6:
non-blocking connect() returned: 111 (Connection refused)
2009-02-19 17:15:43 W [client-protocol.c:332:client_protocol_xfer]
espai6: not connected at the moment to submit frame type(1) op(15)
2009-02-19 17:15:43 E [client-protocol.c:3891:client_statfs_cbk] espai6:
no proper reply from server, returning ENOTCONN
************
So the reason it appears so often is that I've got munin
monitoring this gluster environment, and it runs a "df" command to
check the disk space of all the servers, including, of course, the
gluster mount point. When this happens, the errors shown above
are logged and eventually the mount point on that
server fails. No data is lost, but I have to remount glusterfs as it
becomes stale and the data is not accessible.
Is this normal behaviour?
I could stop munin from running "df" every 5 minutes, but still: is
there a problem in my setup, or is this what gluster is supposed to do?
Thanks.
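For anyone trying to reproduce this without munin: "df" gets its numbers from the statfs/statvfs system call, which matches the client_statfs_cbk entries in the log. The sketch below issues that call directly from Python and reports the error instead of raising it. `check_mount` is a hypothetical helper and the mount path at the bottom is a placeholder, not the path used in this setup.

```python
import os


def check_mount(path):
    """Run statvfs on path (the call behind df) and report the outcome."""
    try:
        st = os.statvfs(path)
    except OSError as exc:
        # A broken mount is expected to fail here, for example with
        # ENOTCONN as seen in the log above.
        return {"path": path, "ok": False,
                "errno": exc.errno, "error": exc.strerror}
    return {
        "path": path,
        "ok": True,
        "total_bytes": st.f_blocks * st.f_frsize,
        "available_bytes": st.f_bavail * st.f_frsize,
    }


if __name__ == "__main__":
    # Placeholder path: point this at the actual gluster mount point.
    print(check_mount("/mnt/glusterfs"))
```

Calling this once by hand while watching the client log would confirm whether a single statvfs request is enough to produce the errors, independently of munin's 5-minute schedule.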