Sure, here is what the setup was:

[root@ovirt1 ~]# systemctl cat var-run-gluster-shared_storage.mount --no-pager
# /run/systemd/generator/var-run-gluster-shared_storage.mount
# Automatically generated by systemd-fstab-generator

[Unit]
SourcePath=/etc/fstab
Documentation=man:fstab(5) man:systemd-fstab-generator(8)

[Mount]
What=gluster1:/gluster_shared_storage
Where=/var/run/gluster/shared_storage
Type=glusterfs
Options=defaults,x-systemd.requires=glusterd.service,x-systemd.automount

[root@ovirt1 ~]# systemctl cat var-run-gluster-shared_storage.automount --no-pager
# /run/systemd/generator/var-run-gluster-shared_storage.automount
# Automatically generated by systemd-fstab-generator

[Unit]
SourcePath=/etc/fstab
Documentation=man:fstab(5) man:systemd-fstab-generator(8)
Before=remote-fs.target
After=glusterd.service
Requires=glusterd.service

[Automount]
Where=/var/run/gluster/shared_storage

[root@ovirt1 ~]# systemctl cat glusterd --no-pager
# /etc/systemd/system/glusterd.service
[Unit]
Description=GlusterFS, a clustered file-system server
Requires=rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount gluster_bricks-isos.mount
After=network.target rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount gluster_bricks-isos.mount
Before=network-online.target

[Service]
Type=forking
PIDFile=/var/run/glusterd.pid
LimitNOFILE=65536
Environment="LOG_LEVEL=INFO"
EnvironmentFile=-/etc/sysconfig/glusterd
ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS
KillMode=process
SuccessExitStatus=15

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/glusterd.service.d/99-cpu.conf
[Service]
CPUAccounting=yes
Slice=glusterfs.slice

[root@ovirt1 ~]# systemctl cat ctdb --no-pager
# /etc/systemd/system/ctdb.service
[Unit]
Description=CTDB
Documentation=man:ctdbd(1) man:ctdb(7)
After=network-online.target time-sync.target glusterd.service var-run-gluster-shared_storage.automount
Conflicts=var-lib-nfs-rpc_pipefs.mount

[Service]
Environment=SYSTEMD_LOG_LEVEL=debug
Type=forking
LimitCORE=infinity
PIDFile=/run/ctdb/ctdbd.pid
ExecStartPre=/bin/bash -c "sleep 2; if [ -f /sys/fs/cgroup/cpu/system.slice/cpu.rt_runtime_us ]; then echo 10000 > /sys/fs/cgroup/cpu/system.slice/cpu.rt_runtime_us; fi"
ExecStartPre=/bin/bash -c 'if [[ $(find /var/log/log.ctdb -type f -size +20971520c 2>/dev/null) ]]; then truncate -s 0 /var/log/log.ctdb; fi'
ExecStartPre=/bin/bash -c 'if [ -d "/var/run/gluster/shared_storage/lock" ]; then exit 4; fi'
ExecStart=/usr/sbin/ctdbd_wrapper /run/ctdb/ctdbd.pid start
ExecStop=/usr/sbin/ctdbd_wrapper /run/ctdb/ctdbd.pid stop
KillMode=control-group
Restart=no

[Install]
WantedBy=multi-user.target
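One remark on the ctdb unit above: if you also want an explicit check that the shared storage is really the gluster fuse mount (and not just an empty local directory) before ctdb starts, an extra ExecStartPre along these lines should do it. I have not tested this exact form, so treat it as a sketch and adjust the path to your lock volume:

# Untested sketch: refuse to start ctdb unless the path is a fuse.glusterfs mount.
# The 'ls' only pokes the path first, so the systemd automount gets triggered
# before findmnt looks at it.
ExecStartPre=/bin/bash -c 'ls /var/run/gluster/shared_storage/ > /dev/null 2>&1; [ "$(findmnt -n -o FSTYPE /var/run/gluster/shared_storage)" = "fuse.glusterfs" ] || exit 4'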
[root@ovirt1 ~]# systemctl cat nfs-ganesha --no-pager
# /usr/lib/systemd/system/nfs-ganesha.service
# This file is part of nfs-ganesha.
#
# There can only be one NFS-server active on a system. When NFS-Ganesha is
# started, the kernel NFS-server should have been stopped. This is achieved by
# the 'Conflicts' directive in this unit.
#
# The Network Locking Manager (rpc.statd) is provided by the nfs-utils package.
# NFS-Ganesha comes with its own nfs-ganesha-lock.service to resolve potential
# conflicts in starting multiple rpc.statd processes. See the comments in the
# nfs-ganesha-lock.service for more details.
#
[Unit]
Description=NFS-Ganesha file server
Documentation=http://github.com/nfs-ganesha/nfs-ganesha/wiki
After=rpcbind.service nfs-ganesha-lock.service
Wants=rpcbind.service nfs-ganesha-lock.service
Conflicts=nfs.target
After=nfs-ganesha-config.service
Wants=nfs-ganesha-config.service

[Service]
Type=forking
Environment="NOFILE=1048576"
EnvironmentFile=-/run/sysconfig/ganesha
ExecStart=/bin/bash -c "${NUMACTL} ${NUMAOPTS} /usr/bin/ganesha.nfsd ${OPTIONS} ${EPOCH}"
ExecStartPost=-/bin/bash -c "prlimit --pid $MAINPID --nofile=$NOFILE:$NOFILE"
ExecStartPost=-/bin/bash -c "/usr/bin/sleep 2 && /bin/dbus-send --system --dest=org.ganesha.nfsd --type=method_call /org/ganesha/nfsd/admin org.ganesha.nfsd.admin.init_fds_limit"
ExecReload=/bin/kill -HUP $MAINPID
ExecStop=/bin/dbus-send --system --dest=org.ganesha.nfsd --type=method_call /org/ganesha/nfsd/admin org.ganesha.nfsd.admin.shutdown

[Install]
WantedBy=multi-user.target
Also=nfs-ganesha-lock.service

I can't guarantee that it will work 100% in your setup, but I remember having only a few hiccups after powering all nodes down and back up.
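For the bind mounts you mentioned: I don't use bind mounts from the shared storage myself, so treat this as an untested sketch, but an fstab entry along these lines should keep a bind mount from firing before the shared storage is really mounted (the 'tftpboot' source and /srv/tftpboot target are just placeholder paths, and this assumes your systemd knows the x-systemd.requires-mounts-for option - mine does):

# Untested sketch - placeholder paths. x-systemd.requires-mounts-for= makes systemd add
# Requires=/After= on whatever mount unit provides the source path, so the bind mount
# waits for the gluster shared storage instead of grabbing an empty local directory.
/var/run/gluster/shared_storage/tftpboot  /srv/tftpboot  none  bind,x-systemd.requires-mounts-for=/var/run/gluster/shared_storage  0 0

In theory, resolving the source path will also trigger the automount, but I have not verified that on all the distros you listed.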
P.S.: I still prefer corosync/pacemaker, but in my setup I cannot have fencing, and in a hyperconverged setup it gets even more complex. If your cluster is Gluster-only, consider Pacemaker for that task.

Best Regards,
Strahil Nikolov

On Nov 4, 2019 15:57, Erik Jacobson <erik.jacobson@xxxxxxx> wrote:
>
> Thank you! I am very interested. I hadn't considered the automounter
> idea.
>
> Also, your fstab has a different dependency approach than mine otherwise
> as well.
>
> If you happen to have the examples handy, I'll give them a shot here.
>
> I'm looking forward to emerging from this dark place of dependencies not
> working!!
>
> Thank you so much for writing back,
>
> Erik
>
> On Mon, Nov 04, 2019 at 06:59:10AM +0200, Strahil wrote:
> > Hi Erik,
> >
> > I took another approach.
> >
> > 1. I got a systemd mount unit for my ctdb lock volume's brick:
> > [root@ovirt1 system]# grep var /etc/fstab
> > gluster1:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults,x-systemd.requires=glusterd.service,x-systemd.automount 0 0
> >
> > As you can see, it is an automount, because sometimes it fails to mount on time.
> >
> > 2. I got custom systemd services for glusterd, ctdb and vdo, as I need to put dependencies on each of those.
> >
> > Now, I'm no longer using ctdb & NFS Ganesha (as my version of ctdb cannot use hostnames and my environment is a little bit crazy), but I can still provide hints on how I did it.
> >
> > Best Regards,
> > Strahil Nikolov
> >
> > On Nov 3, 2019 22:46, Erik Jacobson <erik.jacobson@xxxxxxx> wrote:
> > >
> > > So, I have a solution I have written about in the past that is based on
> > > gluster with CTDB for IP failover and a level of redundancy.
> > >
> > > It's been working fine except for a few quirks I need to work out on
> > > giant clusters when I get access.
> > >
> > > I have a 3x9 gluster volume; each server is also an NFS server, using gluster
> > > NFS (ganesha isn't reliable for my workload yet). There are 9 IP
> > > aliases spread across the 9 servers.
> > >
> > > I also have many bind mounts that point to the shared storage as a
> > > source, and the /gluster/lock volume ("ctdb") of course.
> > >
> > > glusterfs 4.1.6 (rhel8 today, but I use rhel7, rhel8, sles12, and
> > > sles15)
> > >
> > > Things work well when everything is up and running. IP failover works
> > > well when one of the servers goes down. My issue is when that server
> > > comes back up. Despite my best efforts with systemd fstab dependencies,
> > > the shared storage areas, including the gluster lock for CTDB, do not
> > > always get mounted before CTDB starts. This causes trouble for CTDB
> > > correctly joining the collective. I also have problems where my
> > > bind mounts can happen before the shared storage is mounted, despite my
> > > attempts at preventing this with dependencies in fstab.
> > >
> > > I decided a better approach would be to use a gluster hook and just
> > > mount everything I need as I need it, and start up ctdb when I know and
> > > can verify that /gluster/lock is really gluster and not a local disk.
> > >
> > > I started down the road of doing this with a start host hook, and after
> > > spending a while at it, I realized my logic error. This will only fire
> > > when the volume is *started*, not when a server that was down re-joins.
> > >
> > > I took a look at the code, glusterd-hooks.c, and found that support
> > > for "brick start" is not in place for a hook script, but it's nearly
> > > there:
> > >
> > > [GD_OP_START_BRICK] = EMPTY,
> > > ...
> > >
> > > and there is no entry in glusterd_hooks_add_op_args() yet.
> > >
> > > Before I make a patch for my own use, I wanted to do a sanity check and
> > > find out if others have solved this better than the road I'm heading
> > > down.
> > >
> > > What I was thinking of doing is enabling a brick start hook and
> > > doing my processing for volumes being mounted from there. However, I
> > > suppose brick start is a bad choice for the case of simply stopping and
> > > starting the volume, because my processing would try to complete before
> > > the gluster volume was fully started. It would probably work for a brick
> > > "coming back and joining" but not for "stop volume/start volume".
> > >
> > > Any suggestions?
> > >
> > > My end goal is:
> > > - mount shared storage every boot
> > > - only attempt to mount when gluster is available (_netdev doesn't seem
> > >   to be enough)
> > > - never start ctdb unless /gluster/lock is shared storage and not a
> > >   local directory
> > > - only do my bind mounts from shared storage into the rest of the
> > >   layout when we are sure the shared storage is mounted (don't
> > >   bind-mount using an empty directory as a source by accident!)
> > >
> > > Thanks so much for reading my question,
> > >
> > > Erik
>
> Erik Jacobson
> Software Engineer
> erik.jacobson@xxxxxxx
> +1 612 851 0550 Office
> Eagan, MN
> hpe.com

________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users