Thank you! I am very interested. I hadn't considered the automounter idea. Also, your fstab takes a different dependency approach than mine. If you happen to have the examples handy, I'll give them a shot here. I'm looking forward to emerging from this dark place of dependencies not working!!

Thank you so much for writing back,

Erik

On Mon, Nov 04, 2019 at 06:59:10AM +0200, Strahil wrote:
> Hi Erik,
>
> I took another approach.
>
> 1. I got a systemd mount unit for my ctdb lock volume's brick:
>
> [root@ovirt1 system]# grep var /etc/fstab
> gluster1:/gluster_shared_storage /var/run/gluster/shared_storage/ glusterfs defaults,x-systemd.requires=glusterd.service,x-systemd.automount 0 0
>
> As you can see, it is an automount, because otherwise it sometimes fails to mount in time.
>
> 2. I got custom systemd services for glusterd, ctdb and vdo, as I need to put dependencies on each of those.
>
> Now, I'm no longer using ctdb & NFS Ganesha (as my version of ctdb cannot use hostnames and my environment is a little bit crazy), but I can still provide hints on how I did it.
>
> Best Regards,
> Strahil Nikolov
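As a rough illustration of the dependency approach Strahil describes in point 2, a systemd drop-in for ctdb might look like the sketch below. The unit and path names (ctdb.service, the shared_storage mount point from the fstab line above) are assumptions for the example, not taken from his actual setup:

    # /etc/systemd/system/ctdb.service.d/10-wait-for-gluster.conf
    # Illustrative drop-in: do not start ctdb until glusterd is up and the
    # glusterfs mount holding the lock/shared storage is actually mounted.
    [Unit]
    Requires=glusterd.service
    After=glusterd.service
    RequiresMountsFor=/var/run/gluster/shared_storage

After adding the drop-in, run systemctl daemon-reload so it takes effect. RequiresMountsFor= makes systemd add Requires= and After= dependencies on the mount unit it generates from the fstab entry, so ctdb is ordered after the real mount rather than started against an empty local directory.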
> On Nov 3, 2019 22:46, Erik Jacobson <erik.jacobson@xxxxxxx> wrote:
> >
> > So, I have a solution I have written about in the past that is based on
> > gluster with CTDB for IP failover and a level of redundancy.
> >
> > It's been working fine except for a few quirks I need to work out on
> > giant clusters when I get access.
> >
> > I have a 3x9 gluster volume; each server is also an NFS server, using
> > gluster NFS (Ganesha isn't reliable for my workload yet). There are 9 IP
> > aliases spread across the 9 servers.
> >
> > I also have many bind mounts that point to the shared storage as a
> > source, and the /gluster/lock volume ("ctdb") of course.
> >
> > glusterfs 4.1.6 (rhel8 today, but I use rhel7, rhel8, sles12, and
> > sles15)
> >
> > Things work well when everything is up and running. IP failover works
> > well when one of the servers goes down. My issue is when that server
> > comes back up. Despite my best efforts with systemd fstab dependencies,
> > the shared storage areas, including the gluster lock for CTDB, do not
> > always get mounted before CTDB starts. This causes trouble for CTDB
> > correctly joining the collective. I also have problems where my
> > bind mounts can happen before the shared storage is mounted, despite my
> > attempts at preventing this with dependencies in fstab.
> >
> > I decided a better approach would be to use a gluster hook and just
> > mount everything I need as I need it, and start up ctdb only when I
> > know and have verified that /gluster/lock is really gluster and not a
> > local disk.
> >
> > I started down the road of doing this with a start host hook and, after
> > spending a while at it, I realized my logic error. This will only fire
> > when the volume is *started*, not when a server that was down re-joins.
> >
> > I took a look at the code, glusterd-hooks.c, and found that support
> > for "brick start" is not in place for a hook script, but it's nearly
> > there:
> >
> >     [GD_OP_START_BRICK] = EMPTY,
> >     ...
> >
> > and there is no entry in glusterd_hooks_add_op_args() yet.
> >
> > Before I make a patch for my own use, I wanted to do a sanity check and
> > find out if others have solved this better than the road I'm heading
> > down.
> >
> > What I was thinking of doing is enabling a brick start hook and doing
> > my processing for volumes being mounted from there. However, I suppose
> > brick start is a bad choice for the case of simply stopping and starting
> > the volume, because my processing would try to complete before the
> > gluster volume was fully started. It would probably work for a brick
> > "coming back and joining" but not for "stop volume/start volume".
> >
> > Any suggestions?
> >
> > My end goal is:
> > - mount shared storage on every boot
> > - only attempt to mount when gluster is available (_netdev doesn't seem
> >   to be enough)
> > - never start ctdb unless /gluster/lock is shared storage and not a
> >   plain directory
> > - only do my bind mounts from shared storage into the rest of the
> >   layout when we are sure the shared storage is mounted (don't
> >   bind-mount using an empty directory as a source by accident!)
> >
> > Thanks so much for reading my question,
> >
> > Erik

Erik Jacobson
Software Engineer
erik.jacobson@xxxxxxx
+1 612 851 0550 Office
Eagan, MN
hpe.com
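A minimal sketch of the gate described in the goals above ("never start ctdb unless /gluster/lock is shared storage"), for illustration only and not from the thread: it assumes the lock volume has an fstab entry mounting it at /gluster/lock and that CTDB runs as ctdb.service. The script could live in glusterd's post-start hook directory (/var/lib/glusterd/hooks/1/start/post/), or the findmnt test alone could serve as an ExecStartPre= check on ctdb.service:

    #!/bin/bash
    # Illustrative example, not the actual setup from this thread.
    # Mount the CTDB lock volume if needed and refuse to start ctdb
    # unless the path is really a glusterfs mount, not a local directory.

    LOCKDIR=/gluster/lock    # assumed mount point with an fstab entry

    # Mount it if it is not mounted yet.
    mountpoint -q "$LOCKDIR" || mount "$LOCKDIR"

    # Only proceed if the mount is of type fuse.glusterfs.
    if findmnt -n -t fuse.glusterfs "$LOCKDIR" >/dev/null; then
        systemctl start ctdb.service
    else
        logger -t ctdb-gate "$LOCKDIR is not a glusterfs mount; refusing to start ctdb"
        exit 1
    fi

The same findmnt test also works as a guard for the bind mounts: only bind-mount out of the shared storage once the glusterfs mount is confirmed to be in place, which avoids accidentally using an empty local directory as the bind source.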