Kaleb Keithley wrote on 04/02/2016 06:40:
>
> If you're a Debian Wheezy user please give the new packages a try.

Hi Kaleb,

Apologies for the delay in getting back to you. I tried the upgrade on one node last week and it failed but I hadn't had the time to try it again without the feeling of panic around my neck :-).

So, I did the upgrade again on one node only but the node does not restart without error. Excerpt of etc-glusterfs-glusterd.vol.log follows. Other than the errors, the thing that sticks out to me is in the management volume definition where it says "option transport-type rdma" as we're not using rdma. This may of course be a red herring.

I've now uninstalled the 3.7.6 packages and reinstalled 3.6.8. If you need any further information please do let me know. I can try the upgrade again if there are changes you'd like me to make.

Thanks for your work on this so far.

Ronny

[2016-02-10 09:24:42.479807] W [glusterfsd.c:1211:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2016-02-10 09:27:14.031377] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
[2016-02-10 09:27:14.035512] I [MSGID: 106478] [glusterd.c:1350:init] 0-management: Maximum allowed open file descriptors set to 65536
[2016-02-10 09:27:14.035554] I [MSGID: 106479] [glusterd.c:1399:init] 0-management: Using /var/lib/glusterd as working directory
[2016-02-10 09:27:14.040817] W [MSGID: 103071] [rdma.c:4592:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2016-02-10 09:27:14.040848] W [MSGID: 103055] [rdma.c:4899:init] 0-rdma.management: Failed to initialize IB Device
[2016-02-10 09:27:14.040860] W [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2016-02-10 09:27:14.040921] W [rpcsvc.c:1597:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2016-02-10 09:27:14.040937] E [MSGID: 106243] [glusterd.c:1623:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2016-02-10 09:27:15.725220] I [MSGID: 106513] [glusterd-store.c:2047:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 1
[2016-02-10 09:27:15.900057] I [MSGID: 106498] [glusterd-handler.c:3579:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2016-02-10 09:27:15.900138] I [rpc-clnt.c:984:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-02-10 09:27:15.900735] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid argument
[2016-02-10 09:27:15.900749] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2016-02-10 09:27:15.900922] I [MSGID: 106194] [glusterd-store.c:3487:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
[2016-02-10 09:27:15.901746] I [MSGID: 106544] [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: 79083345-b45a-466b-97f3-612ebfac7fe9
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.socket.listen-backlog 128
  8:     option ping-timeout 30
  9:     option transport.socket.read-fail-log off
 10:     option transport.socket.keepalive-interval 2
 11:     option transport.socket.keepalive-time 10
 12:     option transport-type rdma
 13:     option working-directory /var/lib/glusterd
 14: end-volume
 15:
+------------------------------------------------------------------------------+
[2016-02-10 09:27:15.903840] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2016-02-10 09:27:15.903897] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2016-02-10 09:27:15.903945] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-02-10 09:27:15.906366] W [socket.c:588:__socket_rwv] 0-management: readv on 172.18.40.17:24007 failed (Connection reset by peer)
[2016-02-10 09:27:15.906734] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1e7)[0x7f157cb46a57] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1be)[0x7f157c90d1be] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f157c90d2ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x88)[0x7f157c90ec58] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x1d0)[0x7f157c90f2a0] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2016-02-10 09:27:15.904156 (xid=0x1)
[2016-02-10 09:27:15.906778] E [MSGID: 106167] [glusterd-handshake.c:2073:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
[2016-02-10 09:27:15.906969] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1e7)[0x7f157cb46a57] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1be)[0x7f157c90d1be] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f157c90d2ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x88)[0x7f157c90ec58] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x1d0)[0x7f157c90f2a0] ))))) 0-management: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2016-02-10 09:27:15.904171 (xid=0x2)
[2016-02-10 09:27:15.906992] W [rpc-clnt-ping.c:208:rpc_clnt_ping_cbk] 0-management: socket disconnected
[2016-02-10 09:27:15.907014] I [MSGID: 106004] [glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer <metropolis.stor.graysofwestminster.co.uk> (<203011b9-b3da-408b-bb55-6fd088116f3c>), in state <Peer in Cluster>, has disconnected from glusterd.
[2016-02-10 09:27:16.726940] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 1
[2016-02-10 09:27:16.754442] I [MSGID: 106490] [glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 203011b9-b3da-408b-bb55-6fd088116f3c
[2016-02-10 09:27:19.032999] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid argument
[2016-02-10 09:27:19.033025] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2016-02-10 09:27:19.033581] I [MSGID: 106493] [glusterd-handler.c:3780:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to metropolis.stor.graysofwestminster.co.uk (0), ret: 0
[2016-02-10 09:27:19.035777] W [socket.c:588:__socket_rwv] 0-management: readv on 172.18.40.17:24007 failed (Connection reset by peer)
[2016-02-10 09:27:19.035931] I [rpc-clnt.c:984:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2016-02-10 09:27:19.036018] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1e7)[0x7f157cb46a57] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1be)[0x7f157c90d1be] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f157c90d2ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x88)[0x7f157c90ec58] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x1d0)[0x7f157c90f2a0] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2016-02-10 09:27:19.033489 (xid=0x3)

--
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com

Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel