Re: Gluster and NFS-Ganesha - cluster is down after reboot

Jiffin Thottan <jthottan@xxxxxxxxxx> · Tue, 6 Jun 2017 00:43:45 -0400 (EDT)

----- Original Message -----
From: "hvjunk" <hvjunk@xxxxxxxxx>
To: "Adam Ru" <ad.ruckel@xxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx
Sent: Monday, June 5, 2017 9:29:03 PM
Subject: Re:  Gluster and NFS-Ganesha - cluster is down after reboot

Sorry, got sidetracked with invoicing etc. 

https://bitbucket.org/dismyne/gluster-ansibles/src/6df23803df43/ansible/files/?at=master 

The .service files are what goes into systemd, and they call the test-mounts.sh scripts. 
The playbook that installs them is higher up in the directory tree. 

I have submitted a patch based on Hendrik's scripts and systemd service file: https://review.gluster.org/#/c/17339/ 

If everything works out, it can be included in the next stable releases of Gluster (3.8.13 and 3.10.3).

On 05 Jun 2017, at 17:45 , Adam Ru < ad.ruckel@xxxxxxxxx > wrote: 

Hi hvjunk, 

could you please tell me whether you have had time to check my previous post? 

Could you please send me the link you mentioned to your Gluster Ansible scripts? 

Thank you, 

Adam 

On Sun, May 28, 2017 at 2:47 PM, Adam Ru < ad.ruckel@xxxxxxxxx > wrote: 

Hi hvjunk (Hi Hendrik), 

"centos-release-gluster" installs "centos-gluster310". I assume it 
picks the latest version and installs it. 

Would you be so kind as to send me a link to your script and systemd 
service / Ansible scripts? I cannot find a way to list your posts 
on lists.gluster.org (I assume it's not possible to list the posts of a 
specific user). If you cannot find it, could you please tell me when 
you posted it? I'll try to find it. 

In the meantime I wrote something very simple, but I assume your scripting 
skills are better. 

Thank you. 

Kind regards. 

Adam 

---------- 

sudo sh -c 'cat > /root/gluster-run-ganesha << EOF
#!/bin/bash

while true; do
    echo "Wait"
    sleep 30
    if [[ -f /var/run/gluster/shared_storage/nfs-ganesha/ganesha-ha.conf ]]; then
        echo "Start Ganesha"
        systemctl start nfs-ganesha.service
        exit \$?
    else
        echo "Not mounted"
    fi
done
EOF'

sudo chmod +x /root/gluster-run-ganesha 

sudo sh -c 'cat > /etc/systemd/system/custom-gluster-ganesha.service << EOF 
[Unit] 
Description=Start nfs-ganesha when Gluster shared storage is mounted 

[Service] 
Type=oneshot 
ExecStart=/root/gluster-run-ganesha 

[Install] 
WantedBy=multi-user.target 
EOF' 

sudo systemctl enable custom-gluster-ganesha.service 

---------- 
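One possible refinement of the loop above is to give the wait an upper bound, so that a node whose shared storage never mounts ends up with a failed unit instead of a script that loops forever. The following is only a sketch: the function name wait_for_file and the 600 s / 5 s values are made up for illustration and are not part of Gluster or NFS-Ganesha.

```shell
#!/bin/bash
# Sketch only: wait_for_file is a hypothetical helper, not a Gluster/Ganesha tool.

# wait_for_file PATH TIMEOUT_SECONDS [INTERVAL_SECONDS]
# Returns 0 as soon as PATH exists as a regular file,
# or 1 once at least TIMEOUT_SECONDS have been spent waiting.
wait_for_file() {
    local path="$1"
    local timeout="$2"
    local interval="${3:-5}"
    local waited=0

    while [ ! -f "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Possible use inside the helper script (left commented out so that
# sourcing this file has no side effects; timeout values are assumptions):
#
# conf=/var/run/gluster/shared_storage/nfs-ganesha/ganesha-ha.conf
# if wait_for_file "$conf" 600 5; then
#     systemctl start nfs-ganesha.service
# else
#     echo "shared storage still not mounted, giving up" >&2
#     exit 1
# fi
```

Unlike the original loop, this checks for the file before the first sleep, so Ganesha can start immediately when the mount is already in place.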

On Mon, May 15, 2017 at 12:27 PM, hvjunk < hvjunk@xxxxxxxxx > wrote: 

On 15 May 2017, at 12:56 PM, Soumya Koduri < skoduri@xxxxxxxxxx > wrote: 

On 05/12/2017 06:27 PM, Adam Ru wrote: 

Hi Soumya, 

Thank you very much for last response – very useful. 

I apologize for the delay; I had to find time for another round of testing. 

I have updated the instructions that I provided in my previous e-mail. *** means 
that the step was added. 

Instructions: 
- Clean installation of CentOS 7.3 with all updates, 3x node, 
resolvable IPs and VIPs 
- Stopped firewalld (just for testing) 
- *** SELinux in permissive mode (I had to, will explain below) 
- Install "centos-release-gluster" to get "centos-gluster310" repo 

Should I also install centos-gluster310, or will that be chosen automagically by centos-release-gluster? 

and install following (nothing else): 
--- glusterfs-server 
--- glusterfs-ganesha 
- Passwordless SSH between all nodes 
(/var/lib/glusterd/nfs/secret.pem and secret.pem.pub on all nodes) 
- systemctl enable and start glusterd 
- gluster peer probe <other nodes> 
- gluster volume set all cluster.enable-shared-storage enable 

After this step, I'd advise (given my experience doing this with Ansible) making sure that the shared filesystem has propagated to all the nodes and that the needed entries have been made in fstab, as a safety check. I also load my systemd service and helper script at this point to assist with cold-bootstrapping the cluster. 

- systemctl enable and start pcsd.service 
- systemctl enable pacemaker.service (cannot be started at this moment) 
- Set password for hacluster user on all nodes 
- pcs cluster auth <node 1> <node 2> <node 3> -u hacluster -p blabla 
- mkdir /var/run/gluster/shared_storage/nfs-ganesha/ 
- touch /var/run/gluster/shared_storage/nfs-ganesha/ganesha.conf (not 
sure if needed) 
- vi /var/run/gluster/shared_storage/nfs-ganesha/ganesha-ha.conf and 
insert configuration 
- Try listing files on the other nodes: ls 
/var/run/gluster/shared_storage/nfs-ganesha/ 
- gluster nfs-ganesha enable 
- *** systemctl enable pacemaker.service (again, since pacemaker was 
disabled at this point) 
- *** Check owner of "state", "statd", "sm" and "sm.bak" in 
/var/lib/nfs/ (I had to: chown rpcuser:rpcuser 
/var/lib/nfs/statd/state) 
- Check on other nodes that nfs-ganesha.service is running and "pcs 
status" shows started resources 
- gluster volume create mynewshare replica 3 transport tcp 
node1:/<dir> node2:/<dir> node3:/<dir> 
- gluster volume start mynewshare 
- gluster vol set mynewshare ganesha.enable on 
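The safety check suggested above (shared storage propagated to every node, and the fstab entry present) could be scripted roughly as follows. This is a sketch under assumptions: has_mount_entry is a hypothetical helper name, and the check only compares the second field of fstab-style lines, ignoring trailing slashes.

```shell
#!/bin/bash
# Sketch only: has_mount_entry is a hypothetical helper, not a Gluster tool.

# has_mount_entry TABLE MOUNTPOINT
# TABLE is an fstab-style file such as /etc/fstab or /proc/mounts.
# Succeeds if a non-comment line names MOUNTPOINT in its second field.
has_mount_entry() {
    local table="$1"
    local want="${2%/}"            # drop one trailing slash, if any
    [ -n "$want" ] || want=/       # the root mount point stays "/"

    awk -v mp="$want" '
        $1 ~ /^#/ || NF < 2 { next }      # skip comments and short lines
        {
            m = $2
            sub(/\/+$/, "", m)            # ignore trailing slashes
            if (m == "") m = "/"
            if (m == mp) found = 1
        }
        END { exit !found }
    ' "$table"
}

# Possible use on each node (commented out so sourcing has no side effects).
# Assumption: where /var/run is a symlink to /run, /proc/mounts shows the
# resolved path, hence the readlink -f for the second check.
#
# mp=/var/run/gluster/shared_storage
# has_mount_entry /etc/fstab "$mp" || echo "no fstab entry for $mp" >&2
# has_mount_entry /proc/mounts "$(readlink -f "$mp")" || echo "$mp is not mounted" >&2
```

Running both checks on every node before "gluster nfs-ganesha enable" catches the case where the shared volume exists but one node has not mounted it yet.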

At this moment, this is the status of the services that I think are important: 

Service                        Enabled     Active
corosync.service               disabled    active (running)
corosync-notifyd.service       disabled    inactive (dead)
glusterd.service               enabled     active (running)
glusterfsd.service             disabled    inactive (dead)
pacemaker.service              enabled     active (running)
pcsd.service                   enabled     active (running)
nfs-ganesha.service            disabled    active (running)
nfs-ganesha-config.service     static      inactive (dead)
nfs-ganesha-lock.service       static      active (running)
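For comparing nodes, the service states listed above can be collected with a small loop around "systemctl is-enabled" and "systemctl is-active". A minimal sketch; the function name report_service_states is made up for illustration.

```shell
#!/bin/bash
# Sketch only: report_service_states is a hypothetical helper.

# report_service_states UNIT...
# Prints one line per unit with its enablement state and its active state.
report_service_states() {
    local unit enabled active

    printf '%-30s %-10s %s\n' UNIT ENABLED ACTIVE
    for unit in "$@"; do
        # Both subcommands exit non-zero for states such as "disabled" or
        # "inactive", so the exit status is deliberately ignored here.
        enabled=$(systemctl is-enabled "$unit" 2>/dev/null) || true
        active=$(systemctl is-active "$unit" 2>/dev/null) || true
        printf '%-30s %-10s %s\n' "$unit" "${enabled:-unknown}" "${active:-unknown}"
    done
}

# Possible use (commented out so sourcing has no side effects):
#
# report_service_states corosync.service glusterd.service pacemaker.service \
#     pcsd.service nfs-ganesha.service nfs-ganesha-lock.service
```

Note that "is-active" prints only the high-level state (for example "active" or "inactive"); the sub-state in parentheses, such as "(running)" or "(dead)", comes from "systemctl status".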

May I ask you a few questions please? 

1. Could you please confirm that the services above have the correct status/state? 

Looks good to the best of my knowledge. 

2. When I restart a node, nfs-ganesha is not running afterwards. Of course I 
cannot simply enable it, since it must be started only after the shared storage is 
mounted. What is the best practice for starting it automatically so I don't 
have to worry about restarting a node? Should I create a script that 
checks whether the shared storage is mounted and then starts 
nfs-ganesha? How do you do this in production? 

That's right. We plan to address this in the near future (probably by adding a new .service that mounts shared_storage before starting nfs-ganesha). Until then, yes, a custom script is the only way to automate it. 

Refer to my previous posting, which has a script and systemd service that address this bootstrapping problem for locally mounted Gluster directories, which the shared directory is. 
That could be used (with my permission) as a basis to help fix this issue… 

-- 
Adam 


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users