I don't have the volfiles; they are not on our machines. As I said previously, we have no control over the gluster servers.
I did see a graph in the logs that looks similar to a volume file. I will paste it here, but we really have no influence over it. We only use the client to connect to gluster servers that we do not control.
1: volume drslk-prod-client-0
2: type protocol/client
3: option ping-timeout 20
4: option remote-host brick13.gluster.iadm
5: option remote-subvolume /GLUSTERFS/drslk-prod
6: option transport-type socket
7: option frame-timeout 60
8: option send-gids true
9: end-volume
10:
11: volume drslk-prod-client-1
12: type protocol/client
13: option ping-timeout 20
14: option remote-host brick14.gluster.iadm
15: option remote-subvolume /GLUSTERFS/drslk-prod
16: option transport-type socket
17: option frame-timeout 60
18: option send-gids true
19: end-volume
20:
21: volume drslk-prod-client-2
22: type protocol/client
23: option ping-timeout 20
24: option remote-host brick15.gluster.iadm
25: option remote-subvolume /GLUSTERFS/drslk-prod
26: option transport-type socket
27: option frame-timeout 60
28: option send-gids true
29: end-volume
30:
31: volume drslk-prod-replicate-0
32: type cluster/replicate
33: option read-hash-mode 2
34: option data-self-heal-window-size 128
35: option quorum-type auto
36: subvolumes drslk-prod-client-0 drslk-prod-client-1 drslk-prod-client-2
37: end-volume
38:
39: volume drslk-prod-client-3
40: type protocol/client
41: option ping-timeout 20
42: option remote-host brick16.gluster.iadm
43: option remote-subvolume /GLUSTERFS/drslk-prod
44: option transport-type socket
45: option frame-timeout 60
46: option send-gids true
47: end-volume
48:
49: volume drslk-prod-client-4
50: type protocol/client
51: option ping-timeout 20
52: option remote-host brick17.gluster.iadm
53: option remote-subvolume /GLUSTERFS/drslk-prod
54: option transport-type socket
55: option frame-timeout 60
56: option send-gids true
57: end-volume
58:
59: volume drslk-prod-client-5
60: type protocol/client
61: option ping-timeout 20
62: option remote-host brick18.gluster.iadm
63: option remote-subvolume /GLUSTERFS/drslk-prod
64: option transport-type socket
65: option frame-timeout 60
66: option send-gids true
67: end-volume
68:
69: volume drslk-prod-replicate-1
70: type cluster/replicate
71: option read-hash-mode 2
72: option data-self-heal-window-size 128
73: option quorum-type auto
74: subvolumes drslk-prod-client-3 drslk-prod-client-4 drslk-prod-client-5
75: end-volume
76:
77: volume drslk-prod-client-6
78: type protocol/client
79: option ping-timeout 20
80: option remote-host brick19.gluster.iadm
81: option remote-subvolume /GLUSTERFS/drslk-prod
82: option transport-type socket
83: option frame-timeout 60
84: option send-gids true
85: end-volume
86:
87: volume drslk-prod-client-7
88: type protocol/client
89: option ping-timeout 20
90: option remote-host brick20.gluster.iadm
91: option remote-subvolume /GLUSTERFS/drslk-prod
92: option transport-type socket
93: option frame-timeout 60
94: option send-gids true
95: end-volume
96:
97: volume drslk-prod-client-8
98: type protocol/client
99: option ping-timeout 20
100: option remote-host brick21.gluster.iadm
101: option remote-subvolume /GLUSTERFS/drslk-prod
102: option transport-type socket
103: option frame-timeout 60
104: option send-gids true
105: end-volume
106:
107: volume drslk-prod-replicate-2
108: type cluster/replicate
109: option read-hash-mode 2
110: option data-self-heal-window-size 128
111: option quorum-type auto
112: subvolumes drslk-prod-client-6 drslk-prod-client-7 drslk-prod-client-8
113: end-volume
114:
115: volume drslk-prod-client-9
116: type protocol/client
117: option ping-timeout 20
118: option remote-host brick22.gluster.iadm
119: option remote-subvolume /GLUSTERFS/drslk-prod
120: option transport-type socket
121: option frame-timeout 60
122: option send-gids true
123: end-volume
124:
125: volume drslk-prod-client-10
126: type protocol/client
127: option ping-timeout 20
128: option remote-host brick23.gluster.iadm
129: option remote-subvolume /GLUSTERFS/drslk-prod
130: option transport-type socket
131: option frame-timeout 60
132: option send-gids true
133: end-volume
134:
135: volume drslk-prod-client-11
136: type protocol/client
137: option ping-timeout 20
138: option remote-host brick24.gluster.iadm
139: option remote-subvolume /GLUSTERFS/drslk-prod
140: option transport-type socket
141: option frame-timeout 60
142: option send-gids true
143: end-volume
144:
145: volume drslk-prod-replicate-3
146: type cluster/replicate
147: option read-hash-mode 2
148: option data-self-heal-window-size 128
149: option quorum-type auto
150: subvolumes drslk-prod-client-9 drslk-prod-client-10 drslk-prod-client-11
151: end-volume
152:
153: volume drslk-prod-dht
154: type cluster/distribute
155: option min-free-disk 10%
156: option readdir-optimize on
157: subvolumes drslk-prod-replicate-0 drslk-prod-replicate-1 drslk-prod-replicate-2 drslk-prod-replicate-3
158: end-volume
159:
160: volume drslk-prod-write-behind
161: type performance/write-behind
162: option cache-size 1MB
163: subvolumes drslk-prod-dht
164: end-volume
165:
166: volume drslk-prod-read-ahead
167: type performance/read-ahead
168: subvolumes drslk-prod-write-behind
169: end-volume
170:
171: volume drslk-prod-readdir-ahead
172: type performance/readdir-ahead
173: subvolumes drslk-prod-read-ahead
174: end-volume
175:
176: volume drslk-prod-io-cache
177: type performance/io-cache
178: option cache-timeout 60
179: option cache-size 512MB
180: subvolumes drslk-prod-readdir-ahead
181: end-volume
182:
183: volume drslk-prod-quick-read
184: type performance/quick-read
185: option cache-size 512MB
186: subvolumes drslk-prod-io-cache
187: end-volume
188:
189: volume drslk-prod-md-cache
190: type performance/md-cache
191: subvolumes drslk-prod-quick-read
192: end-volume
193:
194: volume drslk-prod
195: type debug/io-stats
196: option latency-measurement off
197: option count-fop-hits off
198: subvolumes drslk-prod-md-cache
199: end-volume
200:
201: volume meta-autoload
202: type meta
203: subvolumes drslk-prod
204: end-volume
205:
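In case it helps with reading the graph above, here is a rough Python sketch that parses the numbered graph as it appears in the client log and lists the timeout options of each protocol/client translator. The function names (parse_graph, client_timeouts) and the SAMPLE string are made up for illustration; the sample is just the first volume block copied from the paste above.

```python
import re
from collections import OrderedDict

# Matches the "12: " style prefix that the client log puts before each graph line.
LINE_NO = re.compile(r"^\s*\d+:\s?")


def parse_graph(text):
    """Return {volume_name: {"type": str, "options": dict, "subvolumes": list}}."""
    volumes = OrderedDict()
    current = None
    for raw in text.splitlines():
        line = LINE_NO.sub("", raw).strip()  # drop the log line number
        if not line:
            continue
        parts = line.split()
        key = parts[0]
        if key == "volume":
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[parts[1]] = current
        elif current is None:
            continue  # ignore anything outside a volume block
        elif key == "type":
            current["type"] = parts[1]
        elif key == "option":
            current["options"][parts[1]] = " ".join(parts[2:])
        elif key == "subvolumes":
            current["subvolumes"] = parts[1:]
        elif key == "end-volume":
            current = None
    return volumes


def client_timeouts(volumes):
    """List (volume, remote-host, frame-timeout, ping-timeout) per protocol/client."""
    rows = []
    for name, vol in volumes.items():
        if vol["type"] == "protocol/client":
            opts = vol["options"]
            rows.append((name, opts.get("remote-host"),
                         opts.get("frame-timeout"), opts.get("ping-timeout")))
    return rows


# First volume block from the graph above, used only as sample input.
SAMPLE = """\
1: volume drslk-prod-client-0
2: type protocol/client
3: option ping-timeout 20
4: option remote-host brick13.gluster.iadm
5: option remote-subvolume /GLUSTERFS/drslk-prod
6: option transport-type socket
7: option frame-timeout 60
8: option send-gids true
9: end-volume
10:
"""

if __name__ == "__main__":
    for row in client_timeouts(parse_graph(SAMPLE)):
        print(row)
```

Run against the whole graph, this should make it easy to confirm that every client translator carries the same frame-timeout and ping-timeout values, which as far as I can tell come from the server-side volume settings and not from anything on our machines.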
Btw, do you think that different versions of gluster client and gluster server could be an issue here?
2015-03-08 1:29 GMT+01:00 Vijay Bellur <vbellur@xxxxxxxxxx>:
On 03/07/2015 06:20 PM, Przemysław Mroczek wrote:
Hi guys,
We have a Rails app which uses gluster as our distributed file
system. The gluster servers are hosted independently as part of a deal
with another company; we have no control over them and connect to them
using the gluster native client.
We tried to resolve this issue with help from the admins of the company
that hosts our gluster servers, but they say it is a client
issue, and we have run out of ideas as to how that is possible, since
we are not doing anything special here.
Information about the independent gluster servers:
- Version: 3.6.0.42.1
- They are using Red Hat
- They are an enterprise, so they always use older versions
Our servers:
System version: Ubuntu 14.04
Our gluster client version: 3.6.2
The exact problem is that it often happens (a couple of times a week) that
errors in gluster cause processes to become zombies. It happens with our
application server (unicorn), nginx and our crawling script, which runs
as a daemon.
Our fstab file:
10.10.11.17:/drslk-prod /mnt/storage glusterfs defaults,_netdev,nobootwait,fetch-attempts=10 0 0
10.10.11.17:/drslk-backup /mnt/backup glusterfs defaults,_netdev,nobootwait,fetch-attempts=10 0 0
Logs from gluster:
[2015-02-18 12:36:12.375695] E [rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fb41ddeada6]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fb41dbc1c7e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb41dbc1d8e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x82)[0x7fb41dbc3602]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fb41dbc3d98]
))))) 0-drslk-prod-client-10: forced
unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-02-18
12:36:12.361489 (xid=0x5d475da)
[2015-02-18 12:36:12.375765] W
[client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10:
remote operation failed: Transport endpoint is not connected. Path:
/system/posts/00/00/71/77/59.jpg (2ad81c2b-a141-478d-9dd4-253345edbceb)
[2015-02-18 12:36:12.376288] E [rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fb41ddeada6]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fb41dbc1c7e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb41dbc1d8e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x82)[0x7fb41dbc3602]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fb41dbc3d98]
))))) 0-drslk-prod-client-10: forced
unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-02-18
12:36:12.361858 (xid=0x5d475db)
[2015-02-18 12:36:12.376355] W
[client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10:
remote operation failed: Transport endpoint is not connected. Path:
/system/posts/00/00/08 (f5c33a99-719e-4ea2-ad1f-33b893af103d)
[2015-02-18 12:36:12.376711] I [socket.c:3292:socket_submit_request]
0-drslk-prod-client-10: not connected (priv->connected = 0)
[2015-02-18 12:36:12.376749] W [rpc-clnt.c:1562:rpc_clnt_submit]
0-drslk-prod-client-10: failed to submit rpc-request (XID: 0x5d475dc
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport
(drslk-prod-client-10)
[2015-02-18 12:36:12.376814] W
[client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10:
remote operation failed: Transport endpoint is not connected. Path:
(null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 12:36:12.376829] I [client.c:2215:client_rpc_notify]
0-drslk-prod-client-10: disconnected from drslk-prod-client-10. Client
process will keep trying to connect to glusterd until brick's port is
available
[2015-02-18 12:36:12.376834] W [rpc-clnt.c:1562:rpc_clnt_submit]
0-drslk-prod-client-10: failed to submit rpc-request (XID: 0x5d475dd
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport
(drslk-prod-client-10)
[2015-02-18 12:36:12.376906] W
[client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10:
remote operation failed: Transport endpoint is not connected. Path:
(null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 12:36:12.376931] E [socket.c:2267:socket_connect_finish]
0-drslk-prod-client-10: connection to 10.10.11.23:24007 failed
(Connection refused)
[2015-02-18 12:36:12.379296] W
[client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10:
remote operation failed: Transport endpoint is not connected. Path:
(null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 12:36:12.379700] W
[client-rpc-fops.c:2766:client3_3_lookup_cbk] 0-drslk-prod-client-10:
remote operation failed: Transport endpoint is not connected. Path:
(null) (00000000-0000-0000-0000-000000000000)
[2015-02-18 13:10:52.759736] E
[client-handshake.c:1496:client_query_portmap_cbk]
0-drslk-prod-client-10: failed to get the port number for remote
subvolume. Please run 'gluster volume status' on server to see if brick
process is running.
[2015-02-18 13:10:52.759796] I [client.c:2215:client_rpc_notify]
0-drslk-prod-client-10: disconnected from drslk-prod-client-10. Client
process will keep trying to connect to glusterd until brick's port is
available
[2015-02-18 13:11:02.897307] I [rpc-clnt.c:1761:rpc_clnt_reconfig]
0-drslk-prod-client-10: changing port to 49349 (from 0)
[2015-02-18 13:11:02.898097] I
[client-handshake.c:1413:select_server_supported_programs]
0-drslk-prod-client-10: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2015-02-18 13:11:02.898446] I
[client-handshake.c:1200:client_setvolume_cbk] 0-drslk-prod-client-10:
Connected to drslk-prod-client-10, attached to remote volume
'/GLUSTERFS/drslk-prod'.
[2015-02-18 13:11:02.898460] I
[client-handshake.c:1210:client_setvolume_cbk] 0-drslk-prod-client-10:
Server and Client lk-version numbers are not same, reopening the fds
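To get a feel for how long a brick connection stayed down, something like the following Python sketch could be run over the client log. It rejoins the wrapped lines into entries, then pairs each "disconnected from" message with the next "Connected to" message for the same translator. The function names and the SAMPLE string are hypothetical, and the two message patterns are simply copied from the excerpt above, so other gluster versions may word them differently.

```python
import re
from datetime import datetime

# A log entry starts with a bracketed timestamp; wrapped lines that follow belong to it.
ENTRY_START = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]")
# Message shapes taken from the log excerpt; "0-" is the graph id prefix.
EVENT = re.compile(r"\d+-(?P<xl>[\w-]+): (?P<what>disconnected from|Connected to) ")


def join_entries(text):
    """Rejoin mail-wrapped lines so that each list item is one log entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def outage_windows(text):
    """Return [(translator, first_disconnect, reconnect)] as datetime pairs."""
    down_since = {}
    windows = []
    for entry in join_entries(text):
        start = ENTRY_START.match(entry)
        event = EVENT.search(entry)
        if not start or not event:
            continue
        when = datetime.strptime(start.group(1), "%Y-%m-%d %H:%M:%S.%f")
        xl = event.group("xl")
        if event.group("what") == "disconnected from":
            down_since.setdefault(xl, when)  # keep only the first disconnect
        elif xl in down_since:
            windows.append((xl, down_since.pop(xl), when))
    return windows


# Three entries copied from the log excerpt, wrapped as in the mail.
SAMPLE = """\
[2015-02-18 12:36:12.376829] I [client.c:2215:client_rpc_notify]
0-drslk-prod-client-10: disconnected from drslk-prod-client-10. Client
process will keep trying to connect to glusterd until brick's port is
available
[2015-02-18 13:10:52.759796] I [client.c:2215:client_rpc_notify]
0-drslk-prod-client-10: disconnected from drslk-prod-client-10. Client
process will keep trying to connect to glusterd until brick's port is
available
[2015-02-18 13:11:02.898446] I
[client-handshake.c:1200:client_setvolume_cbk] 0-drslk-prod-client-10:
Connected to drslk-prod-client-10, attached to remote volume
'/GLUSTERFS/drslk-prod'.
"""

if __name__ == "__main__":
    for xl, down, up in outage_windows(SAMPLE):
        print(xl, down, up, (up - down).total_seconds())
```

On the entries shown here, the first disconnect of drslk-prod-client-10 is logged at 12:36:12 and the reconnect at 13:11:02, so that one brick was unreachable from this client for roughly 35 minutes.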
Can you provide the gluster volume configuration details?
It does look like frame-timeout for the volume has been set to 60. Is there any specific reason? Normally altering the frame-timeout is not recommended.
-Vijay