Hi everyone,
For the past few days I've been experimenting with Gluster and systemd.
The issue I'm trying to solve is that my gluster servers always fail to
self-mount their gluster volume locally on boot. Apparently this is
because the mount happens right after glusterd has been started, but
before it is ready to serve the volume.
I'm doing a refresh of our internal Gluster-based KVM system, bringing
it to Ubuntu 16.04 LTS, which uses systemd. As the Ubuntu Gluster
package as shipped still has this boot/mount issue, and to simplify
things a bit, I've removed all the SysV init and Upstart scripts that
ship with the current Ubuntu Gluster package, aiming for a systemd-only
solution.
The problem, in my opinion, stems from the fact that the unit file for
glusterd declares it as a 'forking' type of service. This means that as
soon as the double fork happens, systemd has no option but to consider
the service available, and it continues with the rest of its work. I
tried to delay the mounting of my /gluster by adding the mount option
"x-systemd.requires=glusterd.service", but for the reason above the
mount still happens immediately after glusterd has started, and then
the mount fails.
Is there a way for systemd to know when the gluster service is actually
able to service a mount request, so one can delay this step of the boot
process?
In the Unit file, I have:
[Unit]
Requires=rpcbind.service
After=network.target rpcbind.service network-online.target
The curious thing is that, according to gluster.log, the gluster client
does find out on which hostnames the subvolumes are available. However,
talking to both the local (0-gv0-client-0) and the remote
(0-gv0-client-1) subvolume fails. For the service on localhost, the
error is 'failed to get the port number for remote subvolume'. For the
remote subvolume, it is 'No route to host'. But at this stage, local
networking (which is fully static and on the same network) should
already be up.
Some error messages during the mount:
[12:15:50.749137] E [MSGID: 114058]
[client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-0:
failed to get the port number for remote subvolume. Please run 'gluster
volume status' on server to see if brick process is running.
[12:15:50.749178] I [MSGID: 114018] [client.c:2042:client_rpc_notify]
0-gv0-client-0: disconnected from gv0-client-0. Client process will keep
trying to connect to glusterd until brick's port is available
[12:15:53.679570] E [socket.c:2278:socket_connect_finish]
0-gv0-client-1: connection to 10.0.0.3:24007 failed (No route to host)
[12:15:53.679611] E [MSGID: 108006] [afr-common.c:3880:afr_notify]
0-gv0-replicate-0: All subvolumes are down. Going offline until atleast
one of them comes back up.
Once the machine has fully booted and I log in, simply typing 'mount
/gluster' always succeeds. I would really appreciate your help in making
this happen on boot without intervention.
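
As a stopgap, the manual 'mount /gluster' that works after boot could be
scripted as a retry loop. The following is only a minimal sketch: the
function name retry, the attempt count and the delay are my own
placeholders, and it works around the ordering problem rather than
telling systemd when glusterd is actually ready.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt failed.
# (Hypothetical helper, not part of Gluster or systemd.)
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Assumed usage late in boot (30 attempts and a 2 second delay are
# guesses, not tuned values):
#   retry 30 2 mount /gluster
```

This would have to be run from something that starts after glusterd, and
it is not a substitute for a proper readiness signal.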
Regards, Paul Boven.
--
Paul Boven <boven@xxxxxxx> +31 (0)521-596547
Unix/Linux/Networking specialist
Joint Institute for VLBI in Europe - www.jive.eu
VLBI - It's a fringe science