Ryan Golhar wrote:
I'm running RHEL 5.3 64-bit. So far, I only want to see that the
cluster can run. I'll worry about getting GFS after I'm confident this
works.
I've got three nodes: pico, vail, and whistler. Each has two NICs, one
that provides a public IP address and another that carries private
communications. All cluster traffic will go over the private network,
192.168.20.0.
I've installed only the following components:
system-config-cluster-1.0.52-1.1, cman-2.0.98-1, and rgmanager-2.0.38-2.
I've created my cluster.conf file to include these three nodes and
fence them using a brocade fibre switch (for GFS).
When I start the cluster services manually on all 3 nodes with:
/sbin/ccsd; /usr/sbin/cman_tool join
The nodes successfully form a cluster. I am able to leave the cluster
and kill ccsd as well.
If I try to start the cman service I see:
[root@pico cluster]# /sbin/service cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing...
And it just hangs. I know my fencing is set up correctly because I've
had nodes fence other nodes before (when I was trying with 6 members).
If I let it sit for long enough, sometimes it finishes successfully. I'm
not sure what it's doing because fence_tool is called and it's a binary...
Ryan,
Is there anything suspicious in the log when it hangs at fencing?
Could you show your cluster.conf?
Vu
Ryan
Gordan Bobic wrote:
What distro are you using? I've found that:
1) Distros other than RHEL/CentOS can be quirky when it comes to using
RHCS. I've even run into problems on Fedora more than once (not to
mention that FC hasn't shipped GFS1 since FC5 and GFS2 hasn't been
deemed production stable until last month - and we're up to FC10 now).
2) Starting RHCS components using anything except the intended init
scripts tends to cause problems.
3) Source of 99% of problems in the rest of the cases (i.e. not
covered by 1) and 2) above) is incorrectly configured fencing.
Does your setup fall under either of the first two categories?
Have you verified beyond doubt that your fencing is configured correctly
and that the fencing script gets verification upon success?
Gordan
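
One part of the fencing question above can be checked without touching the cluster: whether every node in cluster.conf refers to a fence device that is actually defined. The sketch below is only an illustration in Python 3. The sample configuration is hypothetical (the poster's real cluster.conf was not shown in the thread), the device names, address, and credentials are placeholders, and the helper name check_fencing is made up for this example. It checks static consistency only and says nothing about whether the fence agent can really reach the switch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real cluster.conf from the thread is not available.
# Device names, the IP address, and the credentials are placeholders.
SAMPLE_CONF = """<?xml version="1.0"?>
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="pico" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="brocade1" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="vail" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="brocade2" port="2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="whistler" nodeid="3" votes="1">
      <fence/>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_brocade" name="brocade1"
                 ipaddr="192.168.20.250" login="admin" passwd="secret"/>
  </fencedevices>
</cluster>
"""


def check_fencing(conf_text):
    """Return a list of problems found in the fencing layout of conf_text."""
    root = ET.fromstring(conf_text)
    # Names of all devices declared under <fencedevices>.
    defined = {fd.get("name")
               for fd in root.findall("./fencedevices/fencedevice")}
    problems = []
    for node in root.findall("./clusternodes/clusternode"):
        name = node.get("name")
        devices = node.findall("./fence/method/device")
        if not devices:
            problems.append(f"{name}: no fence device configured")
        for dev in devices:
            if dev.get("name") not in defined:
                problems.append(
                    f"{name}: fence device '{dev.get('name')}' "
                    "is not defined in <fencedevices>")
    return problems


if __name__ == "__main__":
    for problem in check_fencing(SAMPLE_CONF):
        print(problem)
```

With the sample above, pico passes, vail is flagged for referring to an undefined device, and whistler is flagged for having no fence device at all. To check a real file, read its contents and pass the text to check_fencing instead of SAMPLE_CONF.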
On Tue, 14 Apr 2009 12:17:44 -0400, Ryan Golhar <golharam@xxxxxxxxx>
wrote:
Hi all,
Is Red Hat Cluster Suite really reliable? I've been having so much
trouble getting a cluster up and running that I'm beginning to second
guess my decision to use this software stack.
I have 3 nodes (eventually 10) running and set up. The fencing
method is by a brocade fibre switch. The ultimate goal of this
cluster is to share a SAN connected by fibre.
I've installed just the bare minimum (before even getting to GFS) to
test the cluster software. Just starting cman cluster services fails
on two of the nodes.
Even when I try to reboot the nodes, I can't because the whole system
hangs on various processes that don't ever shut down. I have to
physically reboot these boxes.
The logs fill up with errors about not being able to connect to cman,
etc.
I've been at it for a while now and am not sure this is the best route
anymore.
Ryan
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster