The versions were:
gluster client: 3.6.2
gluster server: 3.6.0
2015-03-08 18:17 GMT+01:00 Vijay Bellur <vbellur@xxxxxxxxxx>:
On 03/08/2015 09:36 AM, Przemysław Mroczek wrote:
I don't have the volfiles. They are not on our machines; as I said previously,
we have no control over the gluster servers.
I saw a graph in the logs that looks similar to a volume file. I will
paste it here, but we can't really change any of it. We are just
using the client to connect to gluster servers that we do not control.
I would recommend not altering the default for frame timeout.
Btw, do you think that different versions of gluster client and gluster
server could be an issue here?
It can potentially be. What versions are you using on the servers and the client?
-Vijay
2015-03-08 1:29 GMT+01:00 Vijay Bellur <vbellur@xxxxxxxxxx>:
On 03/07/2015 06:20 PM, Przemysław Mroczek wrote:
Hi guys,
We have a Rails app that uses gluster as its distributed file
system. The gluster servers are hosted independently as part of a deal
with another company. We have no control over them; we connect to
them using the gluster native client.
We tried to resolve this issue with help from the admins of the company
that hosts our gluster servers, but they say it is a client
issue, and we have run out of ideas about how that is possible, since we
are not doing anything special here.
Information about the independent gluster servers:
- Version: 3.6.0.42.1
- They are running Red Hat
- They are an enterprise, so they always use older versions
Our servers:
System version: Ubuntu 14.04
Our gluster client version: 3.6.2
The exact problem is that it often happens (a couple of times a week) that
errors in gluster cause processes to become zombies. It happens with our
application server (unicorn), nginx, and our crawling script, which runs
as a daemon.
Our fstab file:
10.10.11.17:/drslk-prod /mnt/storage glusterfs defaults,_netdev,nobootwait,fetch-attempts=10 0 0
10.10.11.17:/drslk-backup /mnt/backup glusterfs defaults,_netdev,nobootwait,fetch-attempts=10 0 0
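To notice a stuck mount before application processes pile up on it, one option is a small probe that touches the mount point from a child process and gives up after a timeout. The following is only a sketch: the function name `mount_responds` and the 5-second timeout are assumptions made for illustration, and the paths in the usage note are the mount points from the fstab entries.

```python
import subprocess
import sys


def mount_responds(path, timeout_s=5.0):
    """Return True if a stat of `path` completes within `timeout_s` seconds.

    The stat runs in a child process so that a hung filesystem blocks the
    child rather than the caller.
    """
    probe = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Exit code 0 means os.stat() succeeded inside the child.
        return probe.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: the child may be blocked inside the kernel,
        # so we deliberately do not wait for it to exit.
        probe.kill()
        return False
```

For example, a cron job or monitoring check could call `mount_responds("/mnt/storage")` and `mount_responds("/mnt/backup")` and raise an alert when either returns False. This only detects the symptom; it does not explain why the client lost its connection.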
Logs from gluster:
[2015-02-18 12:36:12.375695] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fb41ddeada6] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fb41dbc1c7e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb41dbc1d8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x82)[0x7fb41dbc3602] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fb41dbc3d98] ))))) 0-drslk-prod-client-10: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-02-18 12:36:12.361489 (xid=0x5d475da)
[2015-02-18 12:36:12.375765] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10: remote operation failed: Transport endpoint is not connected. Path: /system/posts/00/00/71/77/59.jpg (2ad81c2b-a141-478d-9dd4-253345edbceb)
[2015-02-18 12:36:12.376288] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fb41ddeada6] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fb41dbc1c7e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb41dbc1d8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x82)[0x7fb41dbc3602] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fb41dbc3d98] ))))) 0-drslk-prod-client-10: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-02-18 12:36:12.361858 (xid=0x5d475db)
[2015-02-18 12:36:12.376355] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10: remote operation failed: Transport endpoint is not connected. Path: /system/posts/00/00/08 (f5c33a99-719e-4ea2-ad1f-33b893af103d)
[2015-02-18 12:36:12.376711] I [socket.c:3292:socket_submit_request] 0-drslk-prod-client-10: not connected (priv->connected = 0)
[2015-02-18 12:36:12.376749] W [rpc-clnt.c:1562:rpc_clnt_submit] 0-drslk-prod-client-10: failed to submit rpc-request (XID: 0x5d475dc Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (drslk-prod-client-10)
[2015-02-18 12:36:12.376814] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10: remote operation failed: Transport endpoint is not connected. Path: (null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 12:36:12.376829] I [client.c:2215:client_rpc_notify] 0-drslk-prod-client-10: disconnected from drslk-prod-client-10. Client process will keep trying to connect to glusterd until brick's port is available
[2015-02-18 12:36:12.376834] W [rpc-clnt.c:1562:rpc_clnt_submit] 0-drslk-prod-client-10: failed to submit rpc-request (XID: 0x5d475dd Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (drslk-prod-client-10)
[2015-02-18 12:36:12.376906] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10: remote operation failed: Transport endpoint is not connected. Path: (null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 12:36:12.376931] E [socket.c:2267:socket_connect_finish] 0-drslk-prod-client-10: connection to 10.10.11.23:24007 failed (Connection refused)
[2015-02-18 12:36:12.379296] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10: remote operation failed: Transport endpoint is not connected. Path: (null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 12:36:12.379700] W [client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10: remote operation failed: Transport endpoint is not connected. Path: (null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 13:10:52.759736] E [client-handshake.c:1496:client_query_portmap_cbk] 0-drslk-prod-client-10: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-02-18 13:10:52.759796] I [client.c:2215:client_rpc_notify] 0-drslk-prod-client-10: disconnected from drslk-prod-client-10. Client process will keep trying to connect to glusterd until brick's port is available
[2015-02-18 13:11:02.897307] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-drslk-prod-client-10: changing port to 49349 (from 0)
[2015-02-18 13:11:02.898097] I [client-handshake.c:1413:select_server_supported_programs] 0-drslk-prod-client-10: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-02-18 13:11:02.898446] I [client-handshake.c:1200:client_setvolume_cbk] 0-drslk-prod-client-10: Connected to drslk-prod-client-10, attached to remote volume '/GLUSTERFS/drslk-prod'.
[2015-02-18 13:11:02.898460] I [client-handshake.c:1210:client_setvolume_cbk] 0-drslk-prod-client-10: Server and Client lk-version numbers are not same, reopening the fds
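To see how long the client was cut off from the brick, the disconnect and reconnect entries can be paired up. The sketch below assumes that each entry sits on a single line in the real log file, in the `[timestamp] LEVEL [file:line:function] message` layout shown in the excerpt. The names `LOG_LINE` and `outage_windows` are made up for this example, and the sample lines are shortened copies of three entries from the excerpt.

```python
import re
from datetime import datetime

# [timestamp] LEVEL [file:line:function] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<msg>.*)$"
)


def outage_windows(lines):
    """Return (down_since, up_again) datetime pairs found in log lines.

    A window opens at the first "disconnected from" entry and closes at the
    next "Connected to" entry. Repeated disconnect entries in between are
    treated as part of the same outage.
    """
    windows = []
    down_since = None
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # not a log entry in the expected layout
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        msg = match.group("msg")
        if "disconnected from" in msg:
            if down_since is None:
                down_since = ts
        elif "Connected to" in msg and down_since is not None:
            windows.append((down_since, ts))
            down_since = None
    return windows


sample = [
    "[2015-02-18 12:36:12.376829] I [client.c:2215:client_rpc_notify] "
    "0-drslk-prod-client-10: disconnected from drslk-prod-client-10.",
    "[2015-02-18 13:10:52.759796] I [client.c:2215:client_rpc_notify] "
    "0-drslk-prod-client-10: disconnected from drslk-prod-client-10.",
    "[2015-02-18 13:11:02.898446] I [client-handshake.c:1200:client_setvolume_cbk] "
    "0-drslk-prod-client-10: Connected to drslk-prod-client-10, attached to "
    "remote volume '/GLUSTERFS/drslk-prod'.",
]

for down, up in outage_windows(sample):
    print(down, "->", up, "seconds:", (up - down).total_seconds())
```

On these three entries the sketch reports a single window of a bit under 35 minutes, from 12:36:12 to 13:11:02. It does not distinguish between client subvolumes, so on a log covering several bricks the windows would need to be tracked per subvolume name.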
Can you provide the gluster volume configuration details?
It does look like frame-timeout for the volume has been set to 60.
Is there any specific reason? Normally altering the frame-timeout is
not recommended.
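Without access to the servers, one way to check the value from the client side is to search the client log for the volume graph it prints at mount time. This is a rough sketch under two assumptions: that the graph contains lines of the form `option frame-timeout <seconds>`, and that 1800 seconds is the default for `network.frame-timeout`. The function name and the sample text are hypothetical.

```python
import re

FRAME_TIMEOUT = re.compile(r"option\s+frame-timeout\s+(\d+)")

# Assumption: the default for network.frame-timeout, in seconds.
DEFAULT_FRAME_TIMEOUT_S = 1800


def frame_timeouts(log_text):
    """Return the distinct frame-timeout values (seconds) found in log_text."""
    return sorted({int(value) for value in FRAME_TIMEOUT.findall(log_text)})


# Hypothetical graph excerpt, only to show the expected shape of the input.
sample_graph = """
  1: volume drslk-prod-client-10
  2:     type protocol/client
  3:     option frame-timeout 60
  4: end-volume
"""

for value in frame_timeouts(sample_graph):
    if value != DEFAULT_FRAME_TIMEOUT_S:
        print("frame-timeout is", value, "seconds instead of the default")
```

If the graph shows a non-default value, the server admins are the ones who would have to change it back, since the option is set on the volume.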
-Vijay