Re: Problems with Gluster

On 08/04/2014 02:56 AM, McKenzie, Stan wrote:
Hi Brad --

Thanks for the response.  I've tried what you recommended on node40 and here is the output:


==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2014-08-03 09:25:01.405662] I [glusterfsd.c:1493:main] 0-/opt/glusterfs/3.2.5/sbin/glusterd: Started running /opt/glusterfs/3.2.5/sbin/glusterd version 3.2.5
[2014-08-03 09:25:01.408622] I [glusterd.c:550:init] 0-management: Using /etc/glusterd as working directory
[2014-08-03 09:25:01.410117] E [rpc-transport.c:677:rpc_transport_load] 0-rpc-transport: /opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2014-08-03 09:25:01.410141] E [rpc-transport.c:681:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2014-08-03 09:25:01.410156] W [rpcsvc.c:1288:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2014-08-03 09:25:01.410272] I [glusterd.c:88:glusterd_uuid_init] 0-glusterd: retrieved UUID: 7690fd99-5ed4-4a45-bb3d-7ab54831b543
[2014-08-03 09:25:57.414360] E [common-utils.c:125:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)
[2014-08-03 09:25:57.414408] E [name.c:253:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host nodei.localdomain

Which DNS server do these machines use to resolve the addresses of the other nodes? Can you confirm that DNS is resolving those names correctly, in particular nodei.localdomain, which is the name that fails in the log above?
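
A quick way to check this from node40, as a rough sketch only: getent looks names up through the system's configured name service (typically /etc/hosts, then DNS), so it gives a reasonable picture of what glusterd will see. The helper name check_host is made up for illustration, and the host list is a placeholder apart from nodei.localdomain, which comes from the log above.

```shell
# check_host is an illustrative helper, not part of GlusterFS.
# It prints OK or FAIL depending on whether the name resolves.
check_host() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "OK   $1"
    else
        echo "FAIL $1"
    fi
}

# nodei.localdomain is the name glusterd failed to resolve in the log above.
# Add the other peer names to this list as needed.
for h in nodei.localdomain; do
    check_host "$h"
done
```

Any name reported as FAIL is worth following up in /etc/hosts or on the DNS server before restarting glusterd.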

Continued below.

pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-08-03 09:25:57
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.5
/lib64/libc.so.6[0x30844302d0]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x27)[0x2b131fbcd877]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_peer_rpc_notify+0x1b4)[0x2b131fbb9a04]
/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(rpc_clnt_start+0x17)[0x2b131df7e8a7]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_rpc_create+0xff)[0x2b131fbba74f]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_friend_add+0x2d7)[0x2b131fbbaad7]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_store_retrieve_peers+0x3d2)[0x2b131fbff4c2]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(glusterd_restore+0x78)[0x2b131fc00dd8]
/opt/glusterfs/3.2.5/lib64/glusterfs/3.2.5/xlator/mgmt/glusterd.so(init+0xd12)[0x2b131fbb6312]
/opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(xlator_init+0x58)[0x2b131dd19488]
/opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(glusterfs_graph_init+0x31)[0x2b131dd48501]
/opt/glusterfs/3.2.5/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x88)[0x2b131dd48688]
/opt/glusterfs/3.2.5/sbin/glusterd(glusterfs_process_volfp+0x103)[0x404033]
/opt/glusterfs/3.2.5/sbin/glusterd(glusterfs_volumes_init+0x18b)[0x40424b]
/opt/glusterfs/3.2.5/sbin/glusterd(main+0x419)[0x405299]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x308441d9c4]
/opt/glusterfs/3.2.5/sbin/glusterd[0x403649]

This looks like https://bugzilla.redhat.com/show_bug.cgi?id=787516, which is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=786006. One of the potential causes stated there is:
"- glusterd attempts to restore all the peers.
- One of the peer's ip/hostname is unreachable."

The backtrace above passes through glusterd_restore, glusterd_store_retrieve_peers and glusterd_friend_add, which fits that description. Once again, this may point to a DNS name resolution issue, so please investigate that thoroughly.
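
To see which peer names glusterd tries to restore at startup, something like the following may help. This is a sketch under assumptions: the log shows /etc/glusterd as the working directory, but the peers subdirectory and the "hostname1=" line format are my assumptions about this version's on-disk layout, so adjust if your files look different. list_peer_hosts is a made-up helper name.

```shell
# list_peer_hosts is an illustrative helper, not part of GlusterFS.
# Assumption: each file in the peers directory holds a line of the
# form "hostname1=<name>".
list_peer_hosts() {
    dir="$1"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        sed -n 's/^hostname1=//p' "$f"
    done
}

# /etc/glusterd is the working directory reported in the log; the
# peers subdirectory is an assumption.
list_peer_hosts /etc/glusterd/peers | while read -r h; do
    if getent hosts "$h" >/dev/null 2>&1; then
        echo "OK   $h"
    else
        echo "FAIL $h"
    fi
done
```

If a stored peer name shows up as FAIL, that is the one to fix in DNS or /etc/hosts first.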

Cheers,
Brad

---------

-----Original Message-----
From: Brad Hubbard [mailto:bhubbard@xxxxxxxxxx]
Sent: Saturday, August 02, 2014 4:37 PM
To: McKenzie, Stan; gluster-users@xxxxxxxxxxx
Subject: Re:  Problems with Gluster

On 08/02/2014 01:33 AM, McKenzie, Stan wrote:


* When I ssh to some nodes I get the error:

-bash: /act/Modules/3.2.6/init/bash: No such file or directory
-bash: module: command not found

On other nodes I get a normal login when I ssh.

Have you verified the file "/act/Modules/3.2.6/init/bash" exists on each peer and is not corrupted/truncated?
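
As a rough sketch, a check like this could be run on each node. It only tests that the file exists and is non-empty, so it will not catch every kind of corruption. check_init_file is a made-up helper name; the path is the one from your error message.

```shell
# check_init_file is an illustrative helper.
# It reports whether the file exists and has a non-zero size.
check_init_file() {
    if [ -s "$1" ]; then
        echo "present: $1 ($(wc -c < "$1" | tr -d ' ') bytes)"
    else
        echo "missing or empty: $1"
    fi
}

check_init_file /act/Modules/3.2.6/init/bash
```

Comparing the reported size between a node that logs in normally and one that shows the error should make a truncated copy easy to spot.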

You could also try something like this on node40.

# tail -f /var/log/glusterfs/*.log &

Ignore any output that appears before you run the following.

# service glusterd start

Paste the output somewhere we can view it if it's too large to post in an email.

Cheers,
Brad




--

Kindest Regards,

Brad Hubbard
Senior Software Maintenance Engineer
Red Hat Global Support Services
Asia Pacific Region