On 08/04/2014 02:56 AM, McKenzie, Stan wrote:
Hi Brad --
Thanks for the response. I've tried what you recommended on node40 and here is the output:
==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2014-08-03 09:25:01.405662] I [glusterfsd.c:1493:main] 0-/opt/glusterfs/3.2.5/sbin/glusterd: Started running /opt/glusterfs/3.2.5/sbin/glusterd version 3.2.5
[2014-08-03 09:25:01.408622] I [glusterd.c:550:init] 0-management: Using /etc/glusterd as working directory
[2014-08-03 09:25:01.410117] E [rpc-transport.c:677:rpc_transport_load] 0-rpc-transport: /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2014-08-03 09:25:01.410141] E [rpc-transport.c:681:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2014-08-03 09:25:01.410156] W [rpcsvc.c:1288:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2014-08-03 09:25:01.410272] I [glusterd.c:88:glusterd_uuid_init] 0-glusterd: retrieved UUID: 7690fd99-5ed4-4a45-bb3d-7ab54831b543
[2014-08-03 09:25:57.414360] E [common-utils.c:125:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)
[2014-08-03 09:25:57.414408] E [name.c:253:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host nodei.localdomain
What DNS server do these machines use to resolve the addresses of the
other nodes? Can you confirm that DNS is resolving the peer names
correctly, in particular nodei.localdomain, the host named in the
"DNS resolution failed" error above?
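One way to check this from node40 is a small Python sketch like the one below. It calls getaddrinfo, the same resolver call that fails in the log, for each hostname given on the command line. The function name check_resolution and the script layout are illustrative only, not part of gluster.

```python
import socket
import sys


def check_resolution(hostnames, resolver=socket.getaddrinfo):
    """Return {hostname: sorted list of addresses, or a 'FAILED: ...' string}."""
    results = {}
    for host in hostnames:
        try:
            # Each result is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address for both IPv4 and IPv6.
            infos = resolver(host, None)
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[host] = "FAILED: %s" % exc
    return results


if __name__ == "__main__":
    # Hostnames are taken from the command line; nothing is looked up otherwise.
    for host, outcome in check_resolution(sys.argv[1:]).items():
        print(host, "->", outcome)
```

Saved under a name of your choice and run with the peer names as arguments (for example nodei.localdomain), it prints either the resolved addresses or the resolver error for each name. Running it on several nodes should show whether the failure is specific to node40.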
Continued below.
pending frames:
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-08-03 09:25:57
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.5
/lib64/libc.so.6[0x30844302d0]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x27)[0x2b131fbcd877]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_peer_rpc_notify+0x1b4)[0x2b131fbb9a04]
/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(rpc_clnt_start+0x17)[0x2b131df7e8a7]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_rpc_create+0xff)[0x2b131fbba74f]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_friend_add+0x2d7)[0x2b131fbbaad7]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_store_retrieve_peers+0x3d2)[0x2b131fbff4c2]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_restore+0x78)[0x2b131fc00dd8]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(init+0xd12)[0x2b131fbb6312]
/opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(xlator_init+0x58)[0x2b131dd19488]
/opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(glusterfs_graph_init+0x31)[0x2b131dd48501]
/opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x88)[0x2b131dd48688]
/opt/glusterfs/3.2.5/sbin/glusterd(glusterfs_process_volfp+0x103)[0x404033]
/opt/glusterfs/3.2.5/sbin/glusterd(glusterfs_volumes_init+0x18b)[0x40424b]
/opt/glusterfs/3.2.5/sbin/glusterd(main+0x419)[0x405299]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x308441d9c4]
/opt/glusterfs/3.2.5/sbin/glusterd[0x403649]
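This looks like bug 787516 (https://bugzilla.redhat.com/show_bug.cgi?id=787516),
which is a duplicate of bug 786006
(https://bugzilla.redhat.com/show_bug.cgi?id=786006). One of the
potential causes stated there is:
"- glusterd attempts to restore all the peers.
- One of the peer's ip/hostname is unreachable."
That fits the backtrace above, where glusterd crashes in
glusterd_friend_sm while restoring its stored peers (glusterd_restore,
glusterd_store_retrieve_peers, glusterd_friend_add).
Once again, this may point to a DNS name resolution issue, so please
investigate that thoroughly.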
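To find which stored peer is the unreachable one, something like the following Python sketch may help. It rests on an assumption you should verify on your install: that glusterd keeps one plain key=value text file per peer in a "peers" subdirectory of its working directory (/etc/glusterd according to the log above), with the peer name under a key beginning with "hostname". The function names are hypothetical.

```python
import os
import socket
import sys


def parse_peer_file(text):
    """Return values of key=value lines whose key starts with 'hostname'.

    The key=value layout and the 'hostname' key prefix are assumptions
    about the stored peer files, not something confirmed in this thread.
    """
    hosts = []
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.startswith("hostname") and value:
            hosts.append(value)
    return hosts


def unresolvable_peers(peers_dir):
    """Map each peer file name to the hostnames in it that fail to resolve."""
    failures = {}
    for name in sorted(os.listdir(peers_dir)):
        path = os.path.join(peers_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as handle:
            hosts = parse_peer_file(handle.read())
        bad = []
        for host in hosts:
            try:
                socket.getaddrinfo(host, None)
            except socket.gaierror:
                bad.append(host)
        if bad:
            failures[name] = bad
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the peers directory as the only argument; the script only reads files.
    for name, hosts in unresolvable_peers(sys.argv[1]).items():
        print(name, "->", ", ".join(hosts))
```

Run it on node40 with the peers directory as its argument. Any file it lists names a peer that node40 cannot resolve, which is the condition described in the bug report.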
Cheers,
Brad
---------
-----Original Message-----
From: Brad Hubbard [mailto:bhubbard@xxxxxxxxxx]
Sent: Saturday, August 02, 2014 4:37 PM
To: McKenzie, Stan; gluster-users@xxxxxxxxxxx
Subject: Re: Problems with Gluster
On 08/02/2014 01:33 AM, McKenzie, Stan wrote:
When I ssh to some nodes I get an error:
"-bash: /act/Modules/3.2.6/init/bash: No such file or directory
-bash: module: command not found".
On other nodes when I ssh I get a normal login.
Have you verified the file "/act/Modules/3.2.6/init/bash" exists on each peer and is not corrupted/truncated?
You could also try something like this on node40.
# tail -f /var/log/glusterfs/*.log &
Ignore any output that appears before you run the following.
# service glusterd start
Paste the output somewhere we can view it if it's too large to post in an email.
Cheers,
Brad
--
Kindest Regards,
Brad Hubbard
Senior Software Maintenance Engineer
Red Hat Global Support Services
Asia Pacific Region