RE: Testing Failover - Failing in few cases

rhurst@xxxxxxxxxxxxxxxxx · Wed, 30 May 2007 11:03:09 -0400

We second that motion -- skip U4 altogether, go directly to U5.

On Wed, 2007-05-30 at 16:55 +0200, Hagmann, Michael wrote:

    Hi 

    First of all, if you really are running RHEL4 Update 4, you should update to RHEL4 Update 5 before you do any more testing.

    There are a lot of bugs in RHEL4 CS Update 4!

    Mike

    From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Satya Daragani

    Sent: Monday, 28 May 2007 15:12

    To: Linux-cluster@xxxxxxxxxx

    Subject:  Testing Failover - Failing in few cases

    Hi Linux-Cluster Team,

    Please help me test failover with RHEL Cluster Suite 4 Update 4. The details of the cluster nodes and configuration are below. Kindly suggest how I should proceed.

    IBM Lenovo Thinkcentre with AMD Opteron 64bit processor - Two nodes

    256 MB RAM

    One NIC

    Installed RHEL AS 4 Update 4 on both nodes.

    Configured the NIC on each node in the 192.168.1.x range (node1 – 192.168.1.1, node2 – 192.168.1.2).

    Configured /etc/hosts.

    Installed RHEL Cluster Suite 4 Update 4 on both nodes.

    Added both nodes in the cluster manager with one quorum vote.

    No fence devices configured (chkconfig --del fenced).

    Configured a restricted failover domain, ordered by priority (node1 – 1, node2 – 2).

    Configured a shared IP address resource (192.168.1.5) with the monitor link option enabled.

    Created a service named httpd and configured it as follows:

        Checked "Autostart this service".

        Selected the failover domain configured in the previous steps.

        Selected Relocate as the recovery policy.

        Added the shared resource (the IP address created above) and, under it, added the private script resource (/etc/rc.d/init.d/httpd).
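
For reference, the quorum vote setting in the configuration above can be illustrated with simple majority arithmetic. This is only a minimal sketch, assuming each node carries one vote; it is not the cluster manager's actual code. The function names and the `two_node` flag are hypothetical, used to model the special case in which a two-node cluster is allowed to stay quorate on a single vote.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes required for quorum under a simple majority rule."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int, two_node: bool = False) -> bool:
    """Return True if the surviving votes are enough to keep the cluster quorate."""
    # Assumed special case (hypothetical flag): with exactly two expected votes,
    # one surviving vote is accepted, since a strict majority would need both nodes.
    if two_node and expected_votes == 2:
        return votes_present >= 1
    return votes_present >= quorum_threshold(expected_votes)
```

Under a strict majority rule, a two-node cluster with one vote per node would lose quorum as soon as either node went down, which is why a two-node setup needs the special case modelled by the flag.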

    Checking the failover:

    1st case

    With the above configuration, node1 is the primary node for the httpd service.

    If I restart node1, the service fails over to node2, and once node1 comes back up the service fails back to node1 (as dictated by the configured priority).
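
The behaviour in this first case matches what one would expect from a restricted, ordered failover domain. The following is a toy model of that selection rule, not rgmanager's implementation; `pick_owner` and its arguments are hypothetical names chosen for illustration, and it assumes that a lower priority number means a more preferred node.

```python
from typing import Dict, Optional, Set


def pick_owner(priorities: Dict[str, int], online: Set[str]) -> Optional[str]:
    """Return the online domain member with the lowest priority number, or None."""
    candidates = [node for node in priorities if node in online]
    if not candidates:
        # Restricted domain: no eligible member is online, so the service has no owner.
        return None
    return min(candidates, key=lambda node: priorities[node])


# Priorities as described in the configuration (node1 = 1, node2 = 2).
domain = {"node1": 1, "node2": 2}
```

With both nodes online, `pick_owner(domain, {"node1", "node2"})` selects node1; with only node2 online it selects node2, and once node1 rejoins the choice reverts to node1.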

    2nd case

    With node1 running the httpd service, if I bring down the network interface (ifconfig eth0 down), the service fails over to node2.

    If I then bring the interface back up on node1 (ifconfig eth0 up), the service does not fail back to node1, and /var/log/messages reports "unable to contact the cluster infrastructure". I need your help here.

    If I restart the cluster services on node1, the service starts on node1 again.

    3rd case

    With node1 running the httpd service, if I pull the power cord (that is, an improper shutdown), the service goes into recovery mode and does not start on node2. I need your help here.

    4th case

    With node1 running the httpd service, if I stop or kill httpd (service httpd stop, or killall), failover does not happen. I need your help here.

    -- 

    Thanx

    Satya Daragani

    satya.daragani@xxxxxxxxx 

    +91 98850 58366 

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

Robert Hurst, Sr. Caché Administrator

Beth Israel Deaconess Medical Center

1135 Tremont Street, REN-7

Boston, Massachusetts   02120-2140

617-754-8754 ∙ Fax: 617-754-8730 ∙ Cell: 401-787-3154

Any technology distinguishable from magic is insufficiently advanced.
