Reliably mounting a gluster volume

Hi everyone,

For the past few days I've been experimenting with Gluster and systemd. The issue I'm trying to solve is that my gluster servers always fail to self-mount their gluster volume locally on boot. Apparently this is because the mount happens right after glusterd has been started, but before it is ready to serve the volume.

I'm doing a refresh of our internal Gluster-based KVM system, bringing it to Ubuntu 16.04 LTS, which uses systemd. As the Ubuntu gluster package as shipped still has this boot/mount issue, and to simplify things a bit, I've removed all the SysV and Upstart scripts that ship with the current Ubuntu Gluster package, aiming for a systemd-only solution.

The problem, in my opinion, stems from the fact that the glusterd unit file declares it as a 'forking' type of service. This means that as soon as the double fork happens, systemd has no option but to consider the service available, and continues with the rest of its work. I've tried to delay the mounting of my /gluster by adding "x-systemd.requires=glusterd.service" to its fstab entry, but for the reasons above the mount still happens immediately after glusterd has started, and then fails.
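From memory, the fstab entry looks more or less like this (volume gv0, mounted on /gluster; the server name and exact option list here are a rough reconstruction):

localhost:/gv0  /gluster  glusterfs  defaults,_netdev,x-systemd.requires=glusterd.service  0 0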

Is there a way for systemd to know when the gluster service is actually able to service a mount request, so one can delay this step of the boot process?
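One idea I've been toying with, though I have no idea whether it is the proper systemd way, is a drop-in that keeps glusterd in the 'activating' state until it can actually answer a volume query, so that anything ordered after it only proceeds once the volume looks ready. A rough sketch, assuming the volume is called gv0 and that a successful 'gluster volume status' is a good enough readiness test:

# /etc/systemd/system/glusterd.service.d/wait-for-volume.conf
[Service]
# Hold the unit in 'activating' until glusterd answers a volume query;
# systemd's normal start timeout (TimeoutStartSec) still bounds the wait.
ExecStartPost=/bin/sh -c 'until gluster volume status gv0 >/dev/null 2>&1; do sleep 1; done'

(This needs a 'systemctl daemon-reload' to be picked up, of course.)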

In the Unit file, I have:
[Unit]
Requires=rpcbind.service
After=network.target rpcbind.service network-online.target
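(While staring at this, I also noticed that, if I read the systemd documentation correctly, After=network-online.target only orders the unit against that target; to actually pull the target in, it seems one would also need something like

[Unit]
Wants=network-online.target

in addition to the lines above.)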

The curious thing is that, according to gluster.log, the gluster client does find out on which hostnames the subvolumes are available. However, talking to both the local (0-gv0-client-0) and the remote (0-gv0-client-1) subvolume fails. For the one on localhost, the error is 'failed to get the port number for remote subvolume'; for the remote one, it is 'no route to host'. But at this stage, local networking (which is fully static and on the same network) should already be up.

Some error messages during the mount:

[12:15:50.749137] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[12:15:50.749178] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-gv0-client-0: disconnected from gv0-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[12:15:53.679570] E [socket.c:2278:socket_connect_finish] 0-gv0-client-1: connection to 10.0.0.3:24007 failed (No route to host)
[12:15:53.679611] E [MSGID: 108006] [afr-common.c:3880:afr_notify] 0-gv0-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.

Once the machine has fully booted and I log in, simply typing 'mount /gluster' always succeeds. I would really appreciate your help in making this happen on boot, without intervention.
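As a stopgap I've been wondering whether an automount would at least hide the race, by deferring the actual mount until first access instead of doing it at boot; something along these lines in fstab (again just a sketch, same assumptions as above):

localhost:/gv0  /gluster  glusterfs  defaults,_netdev,noauto,x-systemd.automount,x-systemd.requires=glusterd.service  0 0

But I'd much rather have the ordering fixed properly than paper over it.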

Regards, Paul Boven.
--
Paul Boven <boven@xxxxxxx> +31 (0)521-596547
Unix/Linux/Networking specialist
Joint Institute for VLBI in Europe - www.jive.eu
VLBI - It's a fringe science
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users


