On 11/25/2014 07:08 PM, Scott Merrill wrote:
> On 11/24/14, 11:56 PM, Atin Mukherjee wrote:
>> Can you please find/point out the first instance of the command and its
>> associated glusterd log which failed to acquire the cluster wide lock.
>
>
> Can you help me identify what I should be looking for in the logs?

Grep for the first instance of "locking failed" in the glusterd log on
the server where the command failed.

>
>
> I restarted the glusterd service and see the following on server gluster2:
>
> [2014-11-25 13:34:50.552695] W [glusterfsd.c:1194:cleanup_and_exit] (-->
> 0-: received signum (15), shutting down
> [2014-11-25 13:34:50.569445] I [MSGID: 100030] [glusterfsd.c:2018:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.6.1
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
> [2014-11-25 13:34:50.576951] I [glusterd.c:1214:init] 0-management:
> Maximum allowed open file descriptors set to 65536
> [2014-11-25 13:34:50.577008] I [glusterd.c:1259:init] 0-management:
> Using /var/lib/glusterd as working directory
> [2014-11-25 13:34:50.581436] E [rpc-transport.c:266:rpc_transport_load]
> 0-rpc-transport: /usr/lib64/glusterfs/3.6.1/rpc-transport/rdma.so:
> cannot open shared object file: No such file or directory
> [2014-11-25 13:34:50.581469] W [rpc-transport.c:270:rpc_transport_load]
> 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not
> valid or not found on this machine
> [2014-11-25 13:34:50.581486] W [rpcsvc.c:1524:rpcsvc_transport_create]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2014-11-25 13:34:50.583680] I
> [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd:
> geo-replication module not installed in the system
> [2014-11-25 13:34:50.584503] I
> [glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd:
> retrieved op-version: 30501
> [2014-11-25 13:34:51.074904] I
> [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
> 0-management: connect returned 0
> [2014-11-25 13:34:51.075024] I
> [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
> 0-management: connect returned 0
> [2014-11-25 13:34:51.075107] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:51.082647] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:51.089136] I
> [glusterd-store.c:3501:glusterd_store_retrieve_missed_snaps_list]
> 0-management: No missed snaps list.
> [2014-11-25 13:34:51.094388] I [glusterd.c:146:glusterd_uuid_init]
> 0-management: retrieved UUID: 23989211-4f0d-4087-b9c5-bc82295b2c38
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume management
>   2:     type mgmt/glusterd
>   3:     option rpc-auth.auth-glusterfs on
>   4:     option rpc-auth.auth-unix on
>   5:     option rpc-auth.auth-null on
>   6:     option transport.socket.listen-backlog 128
>   7:     option ping-timeout 30
>   8:     option transport.socket.read-fail-log off
>   9:     option transport.socket.keepalive-interval 2
>  10:     option transport.socket.keepalive-time 10
>  11:     option transport-type rdma
>  12:     option working-directory /var/lib/glusterd
>  13: end-volume
>  14:
> +------------------------------------------------------------------------------+
> [2014-11-25 13:34:57.330091] W [socket.c:611:__socket_rwv] 0-management:
> readv on 192.168.30.107:24007 failed (No data available)
> [2014-11-25 13:34:57.330583] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f7065b04396] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f70658d6fce] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f70658d70de] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x82)[0x7f70658d8a42]
> (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f70658d91f8] )))))
> 0-management: forced unwinding frame type(Peer mgmt) op(--(2)) called at
> 2014-11-25 13:34:51.150255 (xid=0x5)
> [2014-11-25 13:34:57.330641] I [MSGID: 106004]
> [glusterd-handler.c:4365:__glusterd_peer_rpc_notify] 0-management: Peer
> bb61111b-b048-4dc9-b54d-12a0cc2dd8a9, in Peer in Cluster state, has
> disconnected from glusterd.
> [2014-11-25 13:34:57.338507] I
> [glusterd-rpc-ops.c:436:__glusterd_friend_add_cbk] 0-glusterd: Received
> ACC from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6, host:
> gluster3.innova.local, port: 0
> [2014-11-25 13:34:57.361000] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:57.361306] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:57.361535] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:57.361781] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:57.362014] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:57.362241] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:57.362458] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:57.369468] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:58.379351] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-25 13:34:58.379587] W [socket.c:2992:socket_connect]
> 0-management: Ignore failed connection attempt on , (No such file or
> directory)
> [2014-11-25 13:34:58.413871] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed
> (Invalid argument)
> [2014-11-25 13:34:58.413933] I [MSGID: 106006]
> [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management:
> nfs has disconnected from glusterd.
> [2014-11-25 13:34:58.414015] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/2df1fa6567b898d24c7d5f4c98b073a2.socket failed
> (Invalid argument)
> [2014-11-25 13:34:58.414047] I [MSGID: 106006]
> [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management:
> glustershd has disconnected from glusterd.
> [2014-11-25 13:34:58.414832] I
> [glusterd-handshake.c:1061:__glusterd_mgmt_hndsk_versions_ack]
> 0-management: using the op-version 30501
> [2014-11-25 13:34:58.436428] I
> [glusterd-handler.c:2216:__glusterd_handle_incoming_friend_req]
> 0-glusterd: Received probe from uuid: bb61111b-b048-4dc9-b54d-12a0cc2dd8a9
> [2014-11-25 13:35:00.572692] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed
> (Invalid argument)
> [2014-11-25 13:35:00.608802] I [glusterd-pmap.c:227:pmap_registry_bind]
> 0-pmap: adding brick /bricks/criteria1/brick on port 49162
> [2014-11-25 13:35:00.781618] I [glusterd-pmap.c:227:pmap_registry_bind]
> 0-pmap: adding brick /bricks/admin1/brick on port 49161
> [2014-11-25 13:35:00.972159] I [glusterd-pmap.c:227:pmap_registry_bind]
> 0-pmap: adding brick /bricks/integration/epa/brick on port 49159
> [2014-11-25 13:35:01.089892] I [glusterd-pmap.c:227:pmap_registry_bind]
> 0-pmap: adding brick /bricks/epa1/brick on port 49160
> [2014-11-25 13:35:01.094994] I
> [glusterd-handler.c:2373:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6
> [2014-11-25 13:35:01.095056] I
> [glusterd-handler.c:2416:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2014-11-25 13:35:01.095074] I
> [glusterd-handler.c:2416:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2014-11-25 13:35:01.095310] I
> [glusterd-handler.c:3334:glusterd_xfer_friend_add_resp] 0-glusterd:
> Responded to gluster1.innova.local (0), ret: 0
> [2014-11-25 13:35:01.134052] I
> [glusterd-rpc-ops.c:633:__glusterd_friend_update_cbk] 0-management:
> Received ACC from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6
> [2014-11-25 13:35:01.136058] I
> [glusterd-handshake.c:1061:__glusterd_mgmt_hndsk_versions_ack]
> 0-management: using the op-version 30501
> [2014-11-25 13:35:01.156932] I
> [glusterd-handler.c:2216:__glusterd_handle_incoming_friend_req]
> 0-glusterd: Received probe from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6
> [2014-11-25 13:35:01.157314] I
> [glusterd-handler.c:3334:glusterd_xfer_friend_add_resp] 0-glusterd:
> Responded to gluster3.innova.local (0), ret: 0
> [2014-11-25 13:35:01.196912] I
> [glusterd-handshake.c:1061:__glusterd_mgmt_hndsk_versions_ack]
> 0-management: using the op-version 30501
> [2014-11-25 13:35:01.210446] I
> [glusterd-rpc-ops.c:633:__glusterd_friend_update_cbk] 0-management:
> Received ACC from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6
> [2014-11-25 13:35:01.211188] I
> [glusterd-handler.c:2216:__glusterd_handle_incoming_friend_req]
> 0-glusterd: Received probe from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6
> [2014-11-25 13:35:01.211587] I
> [glusterd-handler.c:3334:glusterd_xfer_friend_add_resp] 0-glusterd:
> Responded to gluster3.innova.local (0), ret: 0
> [2014-11-25 13:35:01.246642] I
> [glusterd-rpc-ops.c:633:__glusterd_friend_update_cbk] 0-management:
> Received ACC from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6
> [2014-11-25 13:35:01.283379] I [glusterd-pmap.c:227:pmap_registry_bind]
> 0-pmap: adding brick /bricks/integration/criteria/brick on port 49164
> [2014-11-25 13:35:01.425581] I [glusterd-pmap.c:227:pmap_registry_bind]
> 0-pmap: adding brick /bricks/store1/brick on port 49165
> [2014-11-25 13:35:01.529235] I [glusterd-pmap.c:227:pmap_registry_bind]
> 0-pmap: adding brick /bricks/forms1/brick on port 49163
> [2014-11-25 13:35:03.573038] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed
> (Invalid argument)
> [2014-11-25 13:35:06.573340] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed
> (Invalid argument)
> [2014-11-25 13:35:07.623762] I
> [glusterd-rpc-ops.c:436:__glusterd_friend_add_cbk] 0-glusterd: Received
> ACC from uuid: bb61111b-b048-4dc9-b54d-12a0cc2dd8a9, host:
> gluster1.innova.local, port: 0
> [2014-11-25 13:35:07.641170] I
> [glusterd-handler.c:2373:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: bb61111b-b048-4dc9-b54d-12a0cc2dd8a9
> [2014-11-25 13:35:07.641226] I
> [glusterd-handler.c:2416:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2014-11-25 13:35:08.041722] I
> [glusterd-handler.c:2373:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: bb61111b-b048-4dc9-b54d-12a0cc2dd8a9
> [2014-11-25 13:35:08.041778] I
> [glusterd-handler.c:2416:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2014-11-25 13:35:08.103189] I
> [glusterd-handler.c:2373:__glusterd_handle_friend_update] 0-glusterd:
> Received friend update from uuid: bb61111b-b048-4dc9-b54d-12a0cc2dd8a9
> [2014-11-25 13:35:08.103245] I
> [glusterd-handler.c:2416:__glusterd_handle_friend_update] 0-management:
> Received my uuid as Friend
> [2014-11-25 13:35:09.580234] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed
> (Invalid argument)
> [2014-11-25 13:35:12.580567] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed
> (Invalid argument)
> [2014-11-25 13:35:15.580900] W [socket.c:611:__socket_rwv] 0-management:
> readv on /var/run/9bdd01b8b5f546ce04b25ce7d68e3ace.socket failed
> (Invalid argument)
>
>> There are few cases related to rebalance commands where we may end up
>> having stale locks, have you performed rebalance in between?
>
> All of the volumes on this server pair are replica 2 volumes, so there
> should be nothing to rebalance. I have not explicitly performed any
> rebalance commands.
>
>
>
> Thanks,
> Scott
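
If plain grep is awkward to run, the search for the first "locking
failed" entry can also be sketched in Python. This is only a rough
illustration, not anything shipped with gluster: the function name
first_locking_failure is made up here, and the log file path has to be
supplied by you (on a 3.6 install it is commonly
/var/log/glusterfs/etc-glusterfs-glusterd.vol.log, but treat that as an
assumption and check your own setup).

```python
import sys
from typing import Iterable, Optional, Tuple


def first_locking_failure(
    lines: Iterable[str], needle: str = "locking failed"
) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, line text) of the first line that
    contains `needle` (case-insensitive), or None if no line matches.

    Hypothetical helper for illustration; not part of any gluster tool.
    """
    needle = needle.lower()
    for number, line in enumerate(lines, start=1):
        if needle in line.lower():
            return number, line.rstrip("\n")
    return None


# Only runs when a log path is given, e.g.:
#   python find_lock.py /path/to/glusterd.log
if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as log:
        hit = first_locking_failure(log)
    if hit is None:
        print(f"no 'locking failed' entry found in {path}")
    else:
        print(f"{path}:{hit[0]}: {hit[1]}")
```

Run it on the server where the command failed. With GNU grep,
`grep -n -m1 "locking failed" <logfile>` gives the same first hit; the
entries just before that line are the ones worth posting.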