Hi list,

I'm having some issues setting up a GFS mount with an NFS export on RHEL4, using the latest cluster suite packages from RHN. I'm using GFS from CVS (the RHEL4 branch) and LVM2 (clvmd) built from the source tarball (2.2.01.09), if that makes any difference.

The problem I am having is this: I set up a service with a GFS resource, an NFS export resource, and an NFS client resource. The service starts fine and I can mount the NFS export over the network from clients. But one minute later, and every minute after that, errors show up in my logs and the service is restarted. I looked at clusterfs.sh and saw that it is supposed to run an "isMounted" check every minute... but how can that check be failing when I can access everything just fine, both locally and over NFS?

Here is the error as I am seeing it in /var/log/messages:

Nov 9 10:00:59 wolverine clurgmgrd[6901]: <notice> status on clusterfs "people" returned 1 (generic error)
Nov 9 10:00:59 wolverine clurgmgrd[6901]: <notice> Stopping service NFS people
Nov 9 10:00:59 wolverine clurgmgrd: [6901]: <info> Removing IPv4 address 136.159.***.*** from eth0
Nov 9 10:00:59 wolverine clurgmgrd: [6901]: <info> Removing export: 136.159.***.0/24:/people
Nov 9 10:00:59 wolverine clurgmgrd: [6901]: <info> unmounting /dev/mapper/BIOCOMP-people (/people)
Nov 9 10:00:59 wolverine clurgmgrd[6901]: <notice> Service NFS people is recovering
Nov 9 10:00:59 wolverine clurgmgrd[6901]: <notice> Recovering failed service NFS people
Nov 9 10:01:00 wolverine kernel: GFS: Trying to join cluster "lock_nolock", ""
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: Joined cluster. Now mounting FS...
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: jid=0: Trying to acquire journal lock...
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: jid=0: Looking at journal...
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: jid=0: Done
Nov 9 10:01:00 wolverine clurmtabd[27592]: <err> #20: Failed set log level
Nov 9 10:01:00 wolverine clurgmgrd: [6901]: <info> Adding export: 136.159.***.0/24:/people (rw,sync)
Nov 9 10:01:00 wolverine clurgmgrd: [6901]: <info> Adding IPv4 address 136.159.***.*** to eth0
Nov 9 10:01:01 wolverine clurgmgrd[6901]: <notice> Service NFS people started
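For reference, this is what I'm going to try next to reproduce the status check by hand, outside of rgmanager. I'm assuming the script reads its parameters from OCF_RESKEY_* environment variables the way the other agents in /usr/share/cluster do, and I've filled in the values from my "people" clusterfs resource (shown in cluster.conf below):

    # Feed clusterfs.sh the same parameters rgmanager would pass via the
    # environment, then run its "status" action with a shell trace so I can
    # see exactly where it returns 1.
    export OCF_RESKEY_name="people"
    export OCF_RESKEY_device="/dev/BIOCOMP/people"
    export OCF_RESKEY_mountpoint="/people"
    export OCF_RESKEY_fstype="gfs"
    bash -x /usr/share/cluster/clusterfs.sh status
    echo "status returned: $?"

If that by-hand run also returns 1, at least I'll know it isn't something rgmanager is doing differently when it calls the script.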
And here is my cluster.conf file:

<?xml version="1.0"?>
<cluster config_version="28" name="biocomp_cluster">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="wolverine" votes="1">
            <fence>
                <method name="1">
                    <device name="apcfence" port="1" switch="0"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="skunk" votes="1">
            <fence>
                <method name="1">
                    <device name="apcfence" port="2" switch="0"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="cottontail" votes="1">
            <fence>
                <method name="1">
                    <device name="apcfence" port="3" switch="0"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman/>
    <fencedevices>
        <fencedevice agent="fence_apc" ipaddr="10.1.1.54" login="fence_user" name="apcfence" passwd="*****"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="NFS Failover" ordered="1" restricted="1">
                <failoverdomainnode name="wolverine" priority="3"/>
                <failoverdomainnode name="skunk" priority="2"/>
                <failoverdomainnode name="cottontail" priority="1"/>
            </failoverdomain>
            <failoverdomain name="Cluster Failover" ordered="0" restricted="1">
                <failoverdomainnode name="wolverine" priority="1"/>
                <failoverdomainnode name="skunk" priority="1"/>
                <failoverdomainnode name="cottontail" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <clusterfs device="/dev/BIOCOMP/people" force_unmount="1" fstype="gfs" mountpoint="/people" name="people" options=""/>
            <nfsclient name="people-client" options="rw,sync" target="136.159.***.0/24"/>
            <nfsexport name="people-export"/>
            <nfsclient name="projects-client" options="rw,sync" target="136.159.***.0/24"/>
            <nfsexport name="projects-export"/>
        </resources>
        <service autostart="1" domain="Cluster Failover" name="cluster NAT">
            <ip address="10.1.1.1" monitor_link="1"/>
            <script file="/cluster/scripts/cluster_nat" name="cluster NAT script"/>
        </service>
        <service autostart="1" domain="Cluster Failover" name="NFS people">
            <ip address="136.159.***.***" monitor_link="1"/>
            <clusterfs ref="people">
                <nfsexport ref="people-export">
                    <nfsclient ref="people-client"/>
                </nfsexport>
            </clusterfs>
        </service>
        <service autostart="1" domain="Cluster Failover" name="NFS projects">
            <ip address="136.159.***.***" monitor_link="1"/>
            <clusterfs device="/dev/BIOCOMP/RT_testproject" force_unmount="1" fstype="gfs" mountpoint="/projects/RT_testproject" name="RT_testproject" options="">
                <nfsexport ref="projects-export">
                    <nfsclient ref="projects-client"/>
                </nfsexport>
            </clusterfs>
        </service>
    </rm>
</cluster>

Am I doing something wrong here? I tried looking through /usr/share/cluster/clusterfs.sh to see where it is returning 1, but I haven't been able to debug this issue on my own.

Thoughts, ideas, suggestions?

--
Ryan