>>> below the cluster.conf file ...
>>>
>>> <?xml version="1.0"?>
>>> <cluster name="cluster" config_version="6">
>>>   <!-- post_join_delay: number of seconds the daemon will wait before
>>>        fencing any victims after a node joins the domain
>>>        post_fail_delay: number of seconds the daemon will wait before
>>>        fencing any victims after a domain member fails
>>>        clean_start:     prevent any startup fencing the daemon might do.
>>>                         It indicates that the daemon should assume all nodes
>>>                         are in a clean state to start. -->
>>>   <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>>>   <clusternodes>
>>>     <clusternode name="reporter1.lab.intranet" votes="1" nodeid="1">
>>>       <fence>
>>>         <!-- Handle fencing manually -->
>>>         <method name="human">
>>>           <device name="human" nodename="reporter1.lab.intranet"/>
>>>         </method>
>>>       </fence>
>>>     </clusternode>
>>>     <clusternode name="reporter2.lab.intranet" votes="1" nodeid="2">
>>>       <fence>
>>>         <!-- Handle fencing manually -->
>>>         <method name="human">
>>>           <device name="human" nodename="reporter2.lab.intranet"/>
>>>         </method>
>>>       </fence>
>>>     </clusternode>
>>>   </clusternodes>
>>>   <!-- cman two-node specification -->
>>>   <cman expected_votes="1" two_node="1"/>
>>>   <fencedevices>
>>>     <!-- Define manual fencing -->
>>>     <fencedevice name="human" agent="fence_manual"/>
>>>   </fencedevices>
>>>   <rm>
>>>     <failoverdomains>
>>>       <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">
>>>         <failoverdomainnode name="reporter1.lab.intranet" priority="1"/>
>>>         <failoverdomainnode name="reporter2.lab.intranet" priority="2"/>
>>>       </failoverdomain>
>>>     </failoverdomains>
>>>     <resources>
>>>       <ip address="10.30.30.92" monitor_link="on" sleeptime="10"/>
>>>       <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/>
>>>     </resources>
>>>     <service autostart="1" domain="example_pri" exclusive="0" name="example_apache" recovery="relocate">
>>>       <ip ref="10.30.30.92"/>
>>>       <apache ref="example_server"/>
>>>     </service>
>>>   </rm>
>>> </cluster>
>>>
>>> and this is the result I get on both servers ...
>>>
>>> [root@reporter1 ~]# clustat
>>> Cluster Status for cluster @ Mon Feb 14 22:22:53 2011
>>> Member Status: Quorate
>>>
>>>  Member Name                             ID   Status
>>>  ------ ----                             ---- ------
>>>  reporter1.lab.intranet                      1 Online, Local, rgmanager
>>>  reporter2.lab.intranet                      2 Online, rgmanager
>>>
>>>  Service Name                   Owner (Last)                   State
>>>  ------- ----                   ----- ------                   -----
>>>  service:example_apache         (none)                         stopped
>>>
>>> as you can see, everything is stopped, or in other words nothing runs, so my questions are:
>
>Having a read through /var/log/messages for possible causes would be a
>good start.
>

this is what I see in the /var/log/messages file ...

Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Corosync built-in features: nss rdma
Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Successfully parsed cman config
Feb 16 07:36:54 reporter1 corosync[1250]: [TOTEM ] Initializing transport (UDP/IP).
Feb 16 07:36:54 reporter1 corosync[1250]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Feb 16 07:36:55 reporter1 corosync[1250]: [TOTEM ] The network interface [10.30.30.90] is now up.
Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Using quorum provider quorum_cman
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Feb 16 07:36:55 reporter1 corosync[1250]: [CMAN ] CMAN 3.0.12 (built Aug 17 2010 14:08:49) started
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync configuration service
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync profile loading service
Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Using quorum provider quorum_cman
Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Feb 16 07:36:55 reporter1 corosync[1250]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Feb 16 07:36:55 reporter1 corosync[1250]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Feb 16 07:36:55 reporter1 corosync[1250]: [CMAN ] quorum regained, resuming activity
Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] This node is within the primary component and will provide service.
Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Members[1]: 1
Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Members[1]: 1
Feb 16 07:36:55 reporter1 corosync[1250]: [CPG ] downlist received left_list: 0
Feb 16 07:36:55 reporter1 corosync[1250]: [CPG ] chosen downlist from node r(0) ip(10.30.30.90)
Feb 16 07:36:55 reporter1 corosync[1250]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 07:36:56 reporter1 fenced[1302]: fenced 3.0.12 started
Feb 16 07:36:57 reporter1 dlm_controld[1319]: dlm_controld 3.0.12 started
Feb 16 07:36:57 reporter1 gfs_controld[1374]: gfs_controld 3.0.12 started
Feb 16 07:37:03 reporter1 corosync[1250]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Feb 16 07:37:03 reporter1 corosync[1250]: [QUORUM] Members[2]: 1 2
Feb 16 07:37:03 reporter1 corosync[1250]: [QUORUM] Members[2]: 1 2
Feb 16 07:37:03 reporter1 corosync[1250]: [CPG ] downlist received left_list: 0
Feb 16 07:37:03 reporter1 corosync[1250]: [CPG ] downlist received left_list: 0
Feb 16 07:37:03 reporter1 corosync[1250]: [CPG ] chosen downlist from node r(0) ip(10.30.30.90)

>>> do I have to manually configure the load-balanced ip 10.30.30.92 as an alias ip on both sides, or is it done automatically by the redhat cluster ?
>
>RHCS will automatically assign the IP to an interface that is on the
>same subnet. You most definitely shouldn't create the IP manually on any
>of the nodes.
>
>>> I just made a simple try with apache, but I do not find any reference to the start/stop script for apache in the examples; is that normal ??
>>> do you have some best practice regarding this picture ??
>
>I'm not familiar with the <apache> tag in cluster.conf, I usually
>configure most things as init script resources.
>
>Gordan
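
For reference, the init-script approach Gordan describes would replace the <apache> resource with a <script> resource that points at the distribution's /etc/init.d/httpd. A minimal sketch of the <rm> section, not a tested configuration: the resource name "httpd_init" is only an example, the init script has to behave like an LSB script (start/stop/status with proper exit codes), and httpd should be chkconfig'd off so that only rgmanager starts it:

    <rm>
      <failoverdomains>
        <!-- failover domain unchanged from the cluster.conf above -->
      </failoverdomains>
      <resources>
        <ip address="10.30.30.92" monitor_link="on" sleeptime="10"/>
        <!-- let rgmanager drive the stock init script instead of the apache agent -->
        <script name="httpd_init" file="/etc/init.d/httpd"/>
      </resources>
      <service autostart="1" domain="example_pri" exclusive="0" name="example_apache" recovery="relocate">
        <!-- nesting the script under the ip makes rgmanager bring the IP up before httpd -->
        <ip ref="10.30.30.92">
          <script ref="httpd_init"/>
        </ip>
      </service>
    </rm>

After editing, remember to bump config_version and push the new file to both nodes (on cman 3.x, cman_tool version -r is the usual way to propagate it).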
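
On the "nothing runs" question: with autostart="1" the service should come up once rgmanager starts on a quorate cluster, but it can also be enabled by hand so you can watch what fails. A rough sequence to try, assuming rgmanager is running on both nodes and using the names from the cluster.conf above:

    # sanity-check the configuration against the schema (ships with cman 3.x)
    ccs_config_validate

    # dry-run the service start outside rgmanager to surface resource errors
    # (rg_test is part of the rgmanager package)
    rg_test test /etc/cluster/cluster.conf start service example_apache

    # enable (start) the service and watch what rgmanager logs
    clusvcadm -e example_apache
    tail -f /var/log/messages

    # confirm the state and the floating IP; 10.30.30.92 is added as a
    # secondary address, so check it with "ip addr" rather than ifconfig
    clustat
    ip addr show

If the <apache> resource is what fails, the rg_test output or the rgmanager lines in /var/log/messages should say why.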