Re: DNS resolution failure at boot

I’m actually not sure what you’re talking about here. Was it the reference to x-systemd.requires? Where would that go?

Regardless, the network devices are up and configured. I added an ExecStartPre to the glusterd unit file that runs /bin/bash -c "while ! dig +short {name}.node.consul; do sleep 1; done", one for each host it needs to contact. The systemd output showed that the names were resolved before glusterd started, but glusterd still failed with the same error about not being able to resolve them.
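
One possible reason the dig loop passes while glusterd still fails: dig sends its query straight to the DNS server, whereas the log shows glusterd failing inside getaddrinfo, which goes through the system resolver configuration (nsswitch.conf, resolv.conf). dig also generally exits 0 whenever it gets any reply, so the loop may not wait at all. A minimal sketch of a check that exercises the getaddrinfo path instead, using `getent ahosts`; the function name wait_for_name and the attempt count are hypothetical, and this is not confirmed to fix the problem in this thread:

```shell
#!/bin/sh
# wait_for_name NAME ATTEMPTS   (hypothetical helper, not part of glusterd)
# Poll about once per second until NAME resolves through getaddrinfo(3),
# which is what `getent ahosts` uses, or give up after ATTEMPTS tries.
wait_for_name() {
    name=$1
    attempts=$2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Succeeds only if the system resolver (NSS) returns an address.
        if getent ahosts "$name" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    # Name never resolved within the allowed number of attempts.
    return 1
}
```

Saved as a script that calls the function once per peer (for example `wait_for_name resching-os-control-01.node.consul 30`), it could replace the dig loop in ExecStartPre. If it succeeds and glusterd still fails, the ordering is probably not the cause.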

On 4 Dec 2015, at 11:01, Atin Mukherjee wrote:

You wouldn't need vdsm service here as the mail thread was for an ovirt use
case. Have you tried changing the service file following what Kaushal
mentioned in that mail?

-Atin
Sent from one plus one
On Dec 4, 2015 10:27 PM, "Brian Hicks" brian@xxxxxxxx wrote:

Ah, just tried it on some fresh machines. Looks like the solution that
worked there isn’t making my cluster any happier. Any other thoughts?

(to be clear, looks like that was adding vdsmd-network.service as an After
target, and vdsmd.service as a Before target)

On 4 Dec 2015, at 10:06, Atin Mukherjee wrote:

You might be experiencing this:
https://www.gluster.org/pipermail/gluster-users/2015-November/024292.html

-Atin
Sent from one plus one
On Dec 4, 2015 9:07 PM, "Brian Hicks" brian@xxxxxxxx wrote:

Hi all,

I’m running Gluster 3.7.6 on CentOS 7.1, and using Consul for DNS (for
example, putting all the glusterd servers at glusterfs.service.consul).

I’m seeing odd behavior when I reboot the nodes running glusterd.
Basically, it doesn’t seem to be able to resolve names at boot. I’m using
the default settings plus a systemd drop-in file to make sure that
glusterd starts after DNS is active (nothing complex, just After and
Requires for consul and dnsmasq). I’ve even tried adding an ExecStartPre
with a bash while loop that runs until dig can resolve the addresses listed
in the log file below. Nothing seems to help: my
etc-glusterfs-glusterd.vol.log always contains these lines, and glusterd
fails to start.

Oddly, if I run systemctl start glusterd after the boot process completes,
it starts just fine. Is there some other network target I need to include
in my systemd unit file?
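
For reference, a drop-in of the kind described above might look like the following sketch. It is written as a small shell function so the target directory can be chosen; on a real host that would normally be /etc/systemd/system/glusterd.service.d. The function name write_dropin, the file name 10-wait-for-dns.conf, and the unit names consul.service and dnsmasq.service are assumptions, and nothing in this thread confirms that ordering alone resolves the failure:

```shell
#!/bin/sh
# write_dropin DIR   (hypothetical helper)
# Write a glusterd drop-in into DIR that orders glusterd after the
# network and name-lookup targets and after the DNS services.
# consul.service and dnsmasq.service are assumed unit names.
write_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/10-wait-for-dns.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target nss-lookup.target consul.service dnsmasq.service
Requires=consul.service dnsmasq.service
EOF
}
```

After writing the file, `systemctl daemon-reload` is needed before the next start picks it up. network-online.target and nss-lookup.target are standard systemd targets; whether they are reached late enough for Consul-backed names depends on how consul and dnsmasq report readiness.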

[2015-12-02 22:50:17.493630] I [MSGID: 100030] [glusterfsd.c:2318:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6
(args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2015-12-02 22:50:17.916025] I [MSGID: 106478] [glusterd.c:1350:init]
0-management: Maximum allowed open file descriptors set to 65536
[2015-12-02 22:50:17.916063] I [MSGID: 106479] [glusterd.c:1399:init]
0-management: Using /var/lib/glusterd as working directory
[2015-12-02 22:50:17.980724] E [rpc-transport.c:292:rpc_transport_load]
0-rpc-transport: /usr/lib64/glusterfs/3.7.6/rpc-transport/rdma.so: cannot
open shared object file: No such file or directory
[2015-12-02 22:50:17.980743] W [rpc-transport.c:296:rpc_transport_load]
0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not
valid or not found on this machine
[2015-12-02 22:50:17.980753] W [rpcsvc.c:1597:rpcsvc_transport_create]
0-rpc-service: cannot create listener, initing the transport failed
[2015-12-02 22:50:17.980762] E [MSGID: 106243] [glusterd.c:1623:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2015-12-02 22:50:18.605503] I [MSGID: 106228]
[glusterd.c:433:glusterd_check_gsync_present] 0-glusterd: geo-replication
module not installed in the system [No such file or directory]
[2015-12-02 22:50:18.669326] I [MSGID: 106513]
[glusterd-store.c:2047:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 30706
[2015-12-02 22:50:27.786383] I [MSGID: 106498]
[glusterd-handler.c:3579:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2015-12-02 22:50:27.809153] I [rpc-clnt.c:984:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-12-02 22:50:27.809078] I [MSGID: 106498]
[glusterd-handler.c:3579:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2015-12-02 22:50:37.844756] E [MSGID: 101075]
[common-utils.c:306:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or
service not known)
[2015-12-02 22:50:37.844822] E
[name.c:247:af_inet_client_get_remote_sockaddr] 0-management: DNS
resolution failed on host resching-os-control-02.node.consul
[2015-12-02 22:50:37.845167] I [rpc-clnt.c:984:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-12-02 22:50:37.845259] I [MSGID: 106004]
[glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer
<resching-os-control-02.node.consul>
(<9cf99313-dd68-4ac7-acbb-b018cc167ec2>), in state <Peer in Cluster>, has
disconnected from glusterd.
[2015-12-02 22:50:37.845321] E [MSGID: 106155]
[glusterd-utils.c:199:glusterd_unlock] 0-management: Cluster lock not held!
[2015-12-02 22:50:47.880585] E [MSGID: 101075]
[common-utils.c:306:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or
service not known)
[2015-12-02 22:50:47.880675] E
[name.c:247:af_inet_client_get_remote_sockaddr] 0-management: DNS
resolution failed on host resching-os-control-01.node.consul
[2015-12-02 22:50:47.880870] I [MSGID: 106004]
[glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer
<resching-os-control-01.node.consul>
(<cc7ced64-e3c2-403d-ae01-59ad3f68d6e6>), in state <Peer in Cluster>, has
disconnected from glusterd.
[2015-12-02 22:50:47.880910] E [MSGID: 106155]
[glusterd-utils.c:199:glusterd_unlock] 0-management: Cluster lock not held!
[2015-12-02 22:50:51.583949] E [MSGID: 101075]
[common-utils.c:306:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or
service not known)
[2015-12-02 22:50:51.584013] E
[name.c:247:af_inet_client_get_remote_sockaddr] 0-management: DNS
resolution failed on host resching-os-control-02.node.consul
[2015-12-02 22:50:51.584159] I [MSGID: 106004]
[glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer
<resching-os-control-02.node.consul>
(<9cf99313-dd68-4ac7-acbb-b018cc167ec2>), in state <Peer in Cluster>, has
disconnected from glusterd.
[2015-12-02 22:50:57.917351] E [MSGID: 106408]
[glusterd-peer-utils.c:120:glusterd_peerinfo_find_by_hostname]
0-management: error in getaddrinfo: Name or service not known
[Unknown error -2]
[2015-12-02 22:51:02.605954] E [MSGID: 101075]
[common-utils.c:306:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or
service not known)
[2015-12-02 22:51:02.605990] E
[name.c:247:af_inet_client_get_remote_sockaddr] 0-management: DNS
resolution failed on host resching-os-control-01.node.consul
[2015-12-02 22:51:02.606077] I [MSGID: 106004]
[glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer
<resching-os-control-01.node.consul>
(<cc7ced64-e3c2-403d-ae01-59ad3f68d6e6>), in state <Peer in Cluster>, has
disconnected from glusterd.
[2015-12-02 22:51:07.938471] E [MSGID: 101075]
[common-utils.c:3127:gf_is_local_addr] 0-management: error in getaddrinfo:
Name or service not known

[2015-12-02 22:51:07.938526] E [MSGID: 106187]
[glusterd-store.c:4266:glusterd_resolve_all_bricks] 0-glusterd: resolve
brick failed in restore
[2015-12-02 22:51:07.938559] E [MSGID: 101019] [xlator.c:428:xlator_init]
0-management: Initialization of volume 'management' failed, review your
volfile again
[2015-12-02 22:51:07.938571] E [graph.c:322:glusterfs_graph_init]
0-management: initializing translator failed
[2015-12-02 22:51:07.938579] E [graph.c:661:glusterfs_graph_activate]
0-graph: init failed
[2015-12-02 22:51:07.947613] W glusterfsd.c:1236:cleanup_and_exit
[0x7fda0f9fc24d] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x126)
[0x7fda0f9fc0f6] -->/usr/sbin/glusterd(cleanup_and_exit+0x69)
[0x7fda0f9fb6d9] ) 0-: received signum (0), shutting down

Thanks,

Brian Hicks

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
