Hello
If you would like to see why your service doesn't start, you should use "rg_test test /etc/cluster/cluster.conf start service HA_MGMT".
2013/5/13 Delphine Ramalingom <delphine.ramalingom@xxxxxxxxxxxxxxx>
Hi,
This is the cluster.conf :
[root@titan0 11:29:14 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0" ?>
<!-- Red Hat Cluster Suite configuration for the two-node cluster "HA_MGMT" (config_version 7). -->
<cluster config_version="7" name="HA_MGMT">
<!-- clean_start="1" skips startup fencing of unseen nodes; post_join_delay gives a node 60s to join before it is fenced. -->
<fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="60"/>
<!-- Cluster members: titan0 (nodeid 1) and titan1 (nodeid 2), one vote each; each is fenced by its own IPMI device declared in <fencedevices> below. -->
<clusternodes>
<clusternode name="titan0" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="titan0fence" option="reboot"/>
</method>
</fence>
</clusternode>
<clusternode name="titan1" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="titan1fence" option="reboot"/>
</method>
</fence>
</clusternode>
</clusternodes>
<!-- two_node="1" with expected_votes="1" enables the special two-node quorum mode (cluster stays quorate with a single node). -->
<cman cluster_id="0" expected_votes="1" two_node="1"/>
<!-- IPMI-over-LAN fence agents for each node's BMC. NOTE(review): credentials are stored in clear text here. -->
<fencedevices>
<fencedevice agent="fence_ipmilan" ipaddr="172.17.0.101" login="administrator" name="titan0fence" passwd="administrator"/>
<fencedevice agent="fence_ipmilan" ipaddr="172.17.0.102" login="administrator" name="titan1fence" passwd="administrator"/>
</fencedevices>
<!-- Resource manager (rgmanager) section: failover domains and the services they host. -->
<rm>
<failoverdomains>
<!-- Single-node restricted domains used to pin the per-node heuristic check scripts. -->
<failoverdomain name="titan0_heuristic" ordered="0" restricted="1">
<failoverdomainnode name="titan0" priority="1"/>
</failoverdomain>
<failoverdomain name="titan1_heuristic" ordered="0" restricted="1">
<failoverdomainnode name="titan1" priority="1"/>
</failoverdomain>
<!-- MgmtNodes prefers titan0 (priority 1); NFSHA prefers titan1 (priority 1). Both are unrestricted. -->
<failoverdomain name="MgmtNodes" ordered="0" restricted="0">
<failoverdomainnode name="titan0" priority="1"/>
<failoverdomainnode name="titan1" priority="2"/>
</failoverdomain>
<failoverdomain name="NFSHA" ordered="0" restricted="0">
<failoverdomainnode name="titan0" priority="2"/>
<failoverdomainnode name="titan1" priority="1"/>
</failoverdomain>
</failoverdomains>
<!-- Per-node health-check services: each runs a check script on its pinned node every 10s. -->
<service domain="titan0_heuristic" name="ha_titan0_check" autostart="1" checkinterval="10">
<script file="/usr/sbin/ha_titan0_check" name="ha_titan0_check"/>
</service>
<service domain="titan1_heuristic" name="ha_titan1_check" autostart="1" checkinterval="10">
<script file="/usr/sbin/ha_titan1_check" name="ha_titan1_check"/>
</service>
<!-- Management service: virtual IPs, ext3 filesystems by LABEL, shared GFS2 mounts, then the haservices script. autostart="0": must be enabled manually (e.g. clusvcadm). -->
<service domain="MgmtNodes" name="HA_MGMT" autostart="0" recovery="relocate">
<!-- ip addresses lines mgmt -->
<ip address="172.17.0.99/16" monitor_link="1"/>
<ip address="10.90.0.99/24" monitor_link="1"/>
<!-- devices lines mgmt -->
<fs device="LABEL=postfix" mountpoint="/var/spool/postfix" force_unmount="1" fstype="ext3" name="mgmtha5" options=""/>
<fs device="LABEL=bigimage" mountpoint="/var/lib/systemimager" force_unmount="1" fstype="ext3" name="mgmtha4" options=""/>
<clusterfs device="LABEL=HA_MGMT:conman" mountpoint="/var/log/conman" force_unmount="0" fstype="gfs2" name="mgmtha3" options=""/>
<clusterfs device="LABEL=HA_MGMT:ganglia" mountpoint="/var/lib/ganglia/rrds" force_unmount="0" fstype="gfs2" name="mgmtha2" options=""/>
<clusterfs device="LABEL=HA_MGMT:syslog" mountpoint="/var/log/HOSTS" force_unmount="0" fstype="gfs2" name="mgmtha1" options=""/>
<clusterfs device="LABEL=HA_MGMT:cdb" mountpoint="/var/lib/pgsql/data" force_unmount="0" fstype="gfs2" name="mgmtha0" options=""/>
<script file="/usr/sbin/haservices" name="haservices"/>
</service>
<!-- NFS service: virtual IPs and ext3/xfs exports by LABEL, then the system nfs init script. -->
<service domain="NFSHA" name="HA_NFS" autostart="0" checkinterval="60">
<!-- ip addresses lines nfs -->
<ip address="10.31.0.99/16" monitor_link="1"/>
<ip address="10.90.0.88/24" monitor_link="1"/>
<ip address="172.17.0.88/16" monitor_link="1"/>
<!-- devices lines nfs -->
<fs device="LABEL=PROGS" mountpoint="/programs" force_unmount="1" fstype="ext3" name="nfsha4" options=""/>
<fs device="LABEL=WRKTMP" mountpoint="/worktmp" force_unmount="1" fstype="ext3" name="nfsha3" options=""/>
<fs device="LABEL=LABOS" mountpoint="/labos" force_unmount="1" fstype="xfs" name="nfsha2" options="ikeep"/>
<fs device="LABEL=OPTINTEL" mountpoint="/opt/intel" force_unmount="1" fstype="ext3" name="nfsha1" options=""/>
<fs device="LABEL=HOMENFS" mountpoint="/home_nfs" force_unmount="1" fstype="ext3" name="nfsha0" options=""/>
<script file="/etc/init.d/nfs" name="nfs_service"/>
</service>
</rm>
<!-- Totem token timeout raised to 21000 (milliseconds, per the totem protocol convention; presumably to tolerate slow links. TODO confirm). -->
<totem token="21000" />
</cluster>
<!-- !!!!! DON'T REMOVE OR CHANGE ANYTHING IN PARAMETERS SECTION BELOW
node_name=titan0
node_ipmi_ipaddr=172.17.0.101
node_hwmanager_login=administrator
node_hwmanager_passwd=administrator
ipaddr1_for_heuristics=172.17.0.200
node_ha_name=titan1
node_ha_ipmi_ipaddr=172.17.0.102
node_ha_hwmanager_login=administrator
node_ha_hwmanager_passwd=administrator
ipaddr2_for_heuristics=172.17.0.200
mngt_virt_ipaddr_for_heuristics=not used on this type of node
END OF SECTION !!!!! -->
The /var/log/messages file is too long and has some repeated messages:
May 13 11:30:33 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
May 13 11:30:33 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
May 13 11:30:33 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
May 13 11:30:33 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
May 13 11:30:33 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
May 13 11:30:33 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
May 13 11:30:33 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
May 13 11:30:34 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:53030
May 13 11:30:34 s_sys@titan0 snmpd[4584]: Received SNMP packet(s) from UDP: [10.40.20.30]:53030
May 13 11:30:34 s_sys@titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:41083
May 13 11:30:34 s_sys@titan0 snmpd[4584]: Received SNMP packet(s) from UDP: [10.40.20.30]:41083
Regards
Delphine
Le 13/05/13 10:37, Rajveer Singh a écrit :
Hi Delphine,
It seems there is some filesystem crash. Please share your /var/log/messages and /etc/cluster/cluster.conf files so we can help you further.
Rajveer Singh
On Mon, May 13, 2013 at 11:58 AM, Delphine Ramalingom <delphine.ramalingom@xxxxxxxxxxxxxxx> wrote:
Hello,
I have a problem and I need some help.
Our Linux cluster was stopped for maintenance in the server room, but an error occurred during the shutdown procedure:
Local machine disabling service:HA_MGMT...Failure
The cluster was powered off. But since the restart, I have not succeeded in restarting the services with the clusvcadm command.
I have this message :
clusvcadm -e HA_MGMT
Local machine trying to enable service:HA_MGMT...Aborted; service failed
and
<err> startFilesystem: Could not match LABEL=postfix with a real device
Do you have a solution for me ?
Thanks a lot in advance.
Regards
Delphine
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
--
esta es mi vida e me la vivo hasta que dios quiera
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster