Dear All,

After I successfully solved the ha-lvm/clvmd issue, I am seeing strange behavior from the cluster during startup of the SAP service group. Before starting the service group, it tries to start/stop the SAP instance and mount the disks (although the service has not started up yet)... After this fails, it starts the service itself, which brings up all the resources in the right order without any problem.

What can be the reason for this attempt to start some resources before the whole service is started (and before the node sees itself as up)?

Can you help me identify why the resource dependencies do not work all the time?

Thanks in advance,
Krisztian

Aug 11 17:27:16 linuxsap1 rgmanager[9801]: I am node #1
Aug 11 17:27:16 linuxsap1 rgmanager[9801]: Resource Group Manager Starting
Aug 11 17:27:16 linuxsap1 rgmanager[9801]: Loading Service Data
Aug 11 17:27:17 linuxsap1 rgmanager[9801]: Initializing Services
Aug 11 17:27:18 linuxsap1 rgmanager[10884]: [SAPInstance] sapstartsrv is not running for instance PRD-DVEBMGS00, it will be started now
Aug 11 17:27:18 linuxsap1 rgmanager[10911]: [SAPInstance] sapstartsrv for instance PRD-DVEBMGS00 could not be started!
Aug 11 17:27:18 linuxsap1 rgmanager[10934]: [SAPInstance] SAP Instance PRD-DVEBMGS00 stop failed:
Aug 11 17:27:18 linuxsap1 rgmanager[10956]: [SAPInstance] Attribute POST_STOP_USEREXIT is set to /usr/sap/PRD/sapsrvstop.sh, but this file is not executable
Aug 11 17:27:18 linuxsap1 rgmanager[9801]: stop on SAPInstance "PRD_DVEBMGS00_sapprd" returned 1 (generic error)
Aug 11 17:27:18 linuxsap1 rgmanager[10997]: [SAPDatabase] Cannot find startdb,stopdb and R3trans executable, please set DIR_EXECUTABLE parameter!
Aug 11 17:27:18 linuxsap1 rgmanager[9801]: stop on SAPDatabase "PRD" returned 7 (unspecified)
Aug 11 17:27:18 linuxsap1 rgmanager[11035]: [ip] 10.100.100.104 is not configured
Aug 11 17:27:18 linuxsap1 rgmanager[11072]: [fs] stop: Could not match /dev/vg_PRD_trans/lv_PRD_trans with a real device
Aug 11 17:27:18 linuxsap1 rgmanager[9801]: stop on fs "PRD_trans" returned 2 (invalid argument(s))
Aug 11 17:27:19 linuxsap1 rgmanager[11150]: [fs] stop: Could not match /dev/vg_PRD_usrsap/lv_PRD_usrsap with a real device
Aug 11 17:27:19 linuxsap1 rgmanager[9801]: stop on fs "PRD_usrsap" returned 2 (invalid argument(s))
Aug 11 17:27:21 linuxsap1 rgmanager[11227]: [fs] stop: Could not match /dev/vg_PRD_sapmnt/lv_PRD_sapmnt with a real device
Aug 11 17:27:21 linuxsap1 rgmanager[9801]: stop on fs "PRD_sapmnt" returned 2 (invalid argument(s))
Aug 11 17:27:22 linuxsap1 rgmanager[9801]: stop on fs "PRD_sapdata1" returned 2 (invalid argument(s))
Aug 11 17:27:22 linuxsap1 rgmanager[11305]: [fs] stop: Could not match /dev/vg_PRD_oracle/lv_PRD_sapdata1 with a real device
Aug 11 17:27:22 linuxsap1 rgmanager[11342]: [fs] stop: Could not match /dev/vg_PRD_oracle/lv_PRD_oraarch with a real device
Aug 11 17:27:22 linuxsap1 rgmanager[9801]: stop on fs "PRD_oraarch" returned 2 (invalid argument(s))
Aug 11 17:27:22 linuxsap1 rgmanager[11379]: [fs] stop: Could not match /dev/vg_PRD_oracle/lv_PRD_oralog1 with a real device
Aug 11 17:27:22 linuxsap1 rgmanager[9801]: stop on fs "PRD_oralog1" returned 2 (invalid argument(s))
Aug 11 17:27:22 linuxsap1 rgmanager[11416]: [fs] stop: Could not match /dev/vg_PRD_oracle/lv_PRD_oralog2 with a real device
Aug 11 17:27:22 linuxsap1 rgmanager[9801]: stop on fs "PRD_oralog2" returned 2 (invalid argument(s))
Aug 11 17:27:22 linuxsap1 rgmanager[11453]: [fs] stop: Could not match /dev/vg_PRD_oracle/lv_PRD_orabin with a real device
Aug 11 17:27:22 linuxsap1 rgmanager[9801]: stop on fs "PRD_orabin" returned 2 (invalid argument(s))
Aug 11 17:27:24 linuxsap1 rgmanager[9801]: Services Initialized
Aug 11 17:27:24 linuxsap1 rgmanager[9801]: State change: Local UP
Aug 11 17:27:24 linuxsap1 rgmanager[9801]: Starting stopped service service:SAP-PRD
Aug 11 17:27:25 linuxsap1 rgmanager[11551]: [lvm] Starting volume group, vg_PRD_oracle
Aug 11 17:27:25 linuxsap1 rgmanager[11580]: [lvm] I can claim this volume group
Aug 11 17:27:25 linuxsap1 rgmanager[11619]: [lvm] New tag "linuxsap1-priv" added to vg_PRD_oracle
Aug 11 17:27:26 linuxsap1 rgmanager[11803]: [fs] mounting /dev/dm-13 on /oracle/PRD
Aug 11 17:27:26 linuxsap1 rgmanager[11825]: [fs] mount -t ext4 /dev/dm-13 /oracle/PRD
Aug 11 17:27:26 linuxsap1 rgmanager[11985]: [fs] mounting /dev/dm-15 on /oracle/PRD/origlogB
Aug 11 17:27:26 linuxsap1 rgmanager[12007]: [fs] mount -t ext4 /dev/dm-15 /oracle/PRD/origlogB
Aug 11 17:27:26 linuxsap1 kernel: EXT4-fs (dm-15): warning: maximal mount count reached, running e2fsck is recommended
Aug 11 17:27:26 linuxsap1 kernel: EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts:
Aug 11 17:27:26 linuxsap1 rgmanager[12200]: [fs] mounting /dev/dm-14 on /oracle/PRD/origlogA
Aug 11 17:27:26 linuxsap1 rgmanager[12222]: [fs] mount -t ext4 /dev/dm-14 /oracle/PRD/origlogA
Aug 11 17:27:27 linuxsap1 kernel: EXT4-fs (dm-14): mounted filesystem with ordered data mode. Opts:
Aug 11 17:27:27 linuxsap1 rgmanager[12391]: [fs] mounting /dev/dm-16 on /oracle/PRD/oraarch
Aug 11 17:27:27 linuxsap1 rgmanager[12413]: [fs] mount -t ext4 /dev/dm-16 /oracle/PRD/oraarch
Aug 11 17:27:27 linuxsap1 kernel: EXT4-fs (dm-16): mounted filesystem with ordered data mode. Opts:
Aug 11 17:27:27 linuxsap1 rgmanager[12589]: [fs] mounting /dev/dm-17 on /oracle/PRD/sapdata1
Aug 11 17:27:27 linuxsap1 rgmanager[12611]: [fs] mount -t ext4 /dev/dm-17 /oracle/PRD/sapdata1
Aug 11 17:27:27 linuxsap1 kernel: EXT4-fs (dm-17): mounted filesystem with ordered data mode. Opts:
Aug 11 17:27:28 linuxsap1 rgmanager[12681]: [lvm] Starting volume group, vg_PRD_sapmnt
Aug 11 17:27:28 linuxsap1 rgmanager[12710]: [lvm] I can claim this volume group
Aug 11 17:27:28 linuxsap1 rgmanager[12749]: [lvm] New tag "linuxsap1-priv" added to vg_PRD_sapmnt
Aug 11 17:27:29 linuxsap1 rgmanager[12920]: [fs] mounting /dev/dm-18 on /sapmnt/PRD
Aug 11 17:27:29 linuxsap1 rgmanager[12942]: [fs] mount -t ext4 /dev/dm-18 /sapmnt/PRD
Aug 11 17:27:29 linuxsap1 kernel: EXT4-fs (dm-18): mounted filesystem with ordered data mode. Opts:
Aug 11 17:27:30 linuxsap1 rgmanager[13018]: [lvm] Starting volume group, vg_PRD_usrsap
Aug 11 17:27:30 linuxsap1 rgmanager[13047]: [lvm] I can claim this volume group
Aug 11 17:27:30 linuxsap1 rgmanager[13094]: [lvm] New tag "linuxsap1-priv" added to vg_PRD_usrsap
Aug 11 17:27:31 linuxsap1 rgmanager[13298]: [fs] mounting /dev/dm-19 on /usr/sap/PRD
Aug 11 17:27:31 linuxsap1 rgmanager[13320]: [fs] mount -t ext4 /dev/dm-19 /usr/sap/PRD
Aug 11 17:27:31 linuxsap1 kernel: EXT4-fs (dm-19): warning: maximal mount count reached, running e2fsck is recommended
Aug 11 17:27:31 linuxsap1 kernel: EXT4-fs (dm-19): mounted filesystem with ordered data mode. Opts:
Aug 11 17:27:32 linuxsap1 rgmanager[13391]: [lvm] Starting volume group, vg_PRD_trans
Aug 11 17:27:32 linuxsap1 rgmanager[13422]: [lvm] I can claim this volume group
Aug 11 17:27:32 linuxsap1 rgmanager[13461]: [lvm] New tag "linuxsap1-priv" added to vg_PRD_trans
Aug 11 17:27:33 linuxsap1 rgmanager[13658]: [fs] mounting /dev/dm-33 on /usr/sap/transERP
Aug 11 17:27:33 linuxsap1 rgmanager[13681]: [fs] mount -t ext4 /dev/dm-33 /usr/sap/transERP
Aug 11 17:27:33 linuxsap1 kernel: EXT4-fs (dm-33): mounted filesystem with ordered data mode. Opts:
Aug 11 17:27:33 linuxsap1 kernel: SELinux: initialized (dev dm-33, type ext4), uses xattr
Aug 11 17:27:33 linuxsap1 rgmanager[13761]: [ip] Link for publicteam1: Detected
Aug 11 17:27:33 linuxsap1 rgmanager[13783]: [ip] Adding IPv4 address 10.100.100.104/16 to publicteam1
Aug 11 17:27:33 linuxsap1 rgmanager[13805]: [ip] Pinging addr 10.100.100.104 from dev publicteam1
Aug 11 17:27:35 linuxsap1 rgmanager[13832]: [ip] Sending gratuitous ARP: 10.100.100.104 d0:67:e5:ea:0f:a0 brd ff:ff:ff:ff:ff:ff
Aug 11 17:27:36 linuxsap1 su: pam_unix(su-l:session): session opened for user oraprd by (uid=0)
Aug 11 17:27:37 linuxsap1 su: pam_unix(su-l:session): session closed for user oraprd
Aug 11 17:27:38 linuxsap1 rgmanager[14005]: [SAPDatabase] Oracle Listener LIST_PRD started: Warning: no access to tty (Bad file descriptor).
Aug 11 17:27:38 linuxsap1 Thus no job control in this s
Aug 11 17:27:38 linuxsap1 su: pam_unix(su-l:session): session opened for user prdadm by (uid=0)
Aug 11 17:27:52 linuxsap1 su: pam_unix(su-l:session): session closed for user prdadm
Aug 11 17:27:52 linuxsap1 rgmanager[14275]: [SAPDatabase] SAP database PRD started: Trying to start PRD database ...
Aug 11 17:27:52 linuxsap1 Log file: /home/prdadm/startdb.log
Aug 11 17:27:52 linuxsap1 PRD database start
Aug 11 17:27:52 linuxsap1 rgmanager[14333]: [SAPInstance] sapstartsrv is not running for instance PRD-DVEBMGS00, it will be started now
Aug 11 17:27:53 linuxsap1 SAPPRD_00[14507]: SAP Service SAPPRD_00 successfully started.
Aug 11 17:27:55 linuxsap1 rgmanager[14539]: [SAPInstance] sapstartsrv for instance PRD-DVEBMGS00 was restarted !
Aug 11 17:27:55 linuxsap1 rgmanager[14702]: [SAPInstance] Starting SAP Instance PRD-DVEBMGS00:
Aug 11 17:27:55 linuxsap1 11.08.2012 17:27:55
Aug 11 17:27:55 linuxsap1 Start
Aug 11 17:27:55 linuxsap1 OK
Aug 11 17:28:15 linuxsap1 rgmanager[15169]: [SAPInstance] SAP Instance PRD-DVEBMGS00 started:
Aug 11 17:28:15 linuxsap1 11.08.2012 17:28:15
Aug 11 17:28:15 linuxsap1 WaitforStarted
Aug 11 17:28:15 linuxsap1 OK
Aug 11 17:28:15 linuxsap1 rgmanager[9801]: Service service:SAP-PRD started

The cluster.conf is as follows:

<?xml version="1.0"?>
<cluster config_version="167" name="linuxsap">
  <clusternodes>
    <clusternode name="linuxsap1-priv" nodeid="1">
      <fence>
        <method name="scsi">
          <device key="1" name="scsi_dev"/>
        </method>
      </fence>
      <unfence>
        <device action="on" key="1" name="scsi_dev"/>
      </unfence>
    </clusternode>
    <clusternode name="linuxsap2-priv" nodeid="2">
      <fence>
        <method name="scsi">
          <device key="2" name="scsi_dev"/>
        </method>
      </fence>
      <unfence>
        <device action="on" key="2" name="scsi_dev"/>
      </unfence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="3" transport="udpu"/>
  <rm>
    <failoverdomains>
      <failoverdomain name="FOD-SAP" nofailback="1" ordered="1" restricted="0">
        <failoverdomainnode name="linuxsap1-priv" priority="1"/>
        <failoverdomainnode name="linuxsap2-priv" priority="2"/>
      </failoverdomain>
      <failoverdomain name="FOD-Oracle" nofailback="1" ordered="1" restricted="0">
        <failoverdomainnode name="linuxsap1-priv" priority="2"/>
        <failoverdomainnode name="linuxsap2-priv" priority="1"/>
      </failoverdomain>
      <failoverdomain name="FOD-LinuxSap1" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="linuxsap1-priv"/>
      </failoverdomain>
      <failoverdomain name="FOD-LinuxSap2" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="linuxsap2-priv"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <lvm name="vg_PRD_oracle" vg_name="vg_PRD_oracle"/>
      <lvm name="vg_PRD_sapmnt" vg_name="vg_PRD_sapmnt"/>
      <lvm name="vg_PRD_usrsap" vg_name="vg_PRD_usrsap"/>
      <lvm name="vg_PRD_trans" vg_name="vg_PRD_trans"/>
      <fs device="/dev/vg_PRD_oracle/lv_PRD_orabin" force_unmount="1" fstype="ext4" mountpoint="/oracle/PRD" name="PRD_orabin"/>
      <fs device="/dev/vg_PRD_oracle/lv_PRD_oralog1" force_unmount="1" fstype="ext4" mountpoint="/oracle/PRD/origlogA" name="PRD_oralog1"/>
      <fs device="/dev/vg_PRD_oracle/lv_PRD_oralog2" force_unmount="1" fstype="ext4" mountpoint="/oracle/PRD/origlogB" name="PRD_oralog2"/>
      <fs device="/dev/vg_PRD_oracle/lv_PRD_oraarch" force_unmount="1" fstype="ext4" mountpoint="/oracle/PRD/oraarch" name="PRD_oraarch"/>
      <fs device="/dev/vg_PRD_oracle/lv_PRD_sapdata1" force_unmount="1" fstype="ext4" mountpoint="/oracle/PRD/sapdata1" name="PRD_sapdata1"/>
      <fs device="/dev/vg_PRD_sapmnt/lv_PRD_sapmnt" force_unmount="1" fstype="ext4" mountpoint="/sapmnt/PRD" name="PRD_sapmnt"/>
      <fs device="/dev/vg_PRD_usrsap/lv_PRD_usrsap" force_unmount="1" fstype="ext4" mountpoint="/usr/sap/PRD" name="PRD_usrsap"/>
      <fs device="/dev/vg_PRD_trans/lv_PRD_trans" force_unmount="1" fstype="ext4" mountpoint="/usr/sap/transERP" name="PRD_trans"/>
      <ip address="10.100.100.104" monitor_link="on" sleeptime="10"/>
      <SAPInstance DIR_EXECUTABLE="/sapmnt/PRD/exe" DIR_PROFILE="/sapmnt/PRD/profile" InstanceName="PRD_DVEBMGS00_sapprd" POST_STOP_USEREXIT="/usr/sap/PRD/sapsrvstop.sh" START_PROFILE="/sapmnt/PRD/profile/START_DVEBMGS00_sapprd" START_WAITTIME="60"/>
      <SAPDatabase DBTYPE="ORA" DIR_EXECUTABLE="/sapmnt/PRD/exe" NETSERVICENAME="LIST_PRD" POST_STOP_USEREXIT="/usr/sap/PRD/sapsrvstop.sh" SID="PRD"/>
      <fs device="/dev/vg_teszt_10GB/lv_teszt_10GB" force_unmount="1" fsid="1886" fstype="ext4" mountpoint="/teszt" name="Teszt_10GB"/>
      <lvm name="vg_teszt_10GB" vg_name="vg_teszt_10GB"/>
    </resources>
    <service domain="FOD-SAP" name="SAP-PRD" recovery="relocate">
      <lvm ref="vg_PRD_oracle">
        <fs ref="PRD_orabin">
          <fs ref="PRD_oralog2"/>
          <fs ref="PRD_oralog1"/>
          <fs ref="PRD_oraarch"/>
          <fs ref="PRD_sapdata1"/>
        </fs>
      </lvm>
      <lvm ref="vg_PRD_sapmnt">
        <fs ref="PRD_sapmnt"/>
      </lvm>
      <lvm ref="vg_PRD_usrsap">
        <fs ref="PRD_usrsap"/>
      </lvm>
      <lvm ref="vg_PRD_trans">
        <fs ref="PRD_trans"/>
      </lvm>
      <ip ref="10.100.100.104"/>
      <SAPDatabase ref="PRD">
        <SAPInstance ref="PRD_DVEBMGS00_sapprd"/>
      </SAPDatabase>
    </service>
  </rm>
  <dlm enable_deadlk="1" enable_quorum="1"/>
  <quorumd label="qdisk_dev"/>
  <fencedevices>
    <fencedevice agent="fence_scsi" aptpl="1" devices="/dev/mapper/36006016057a01e006226605213c4e111,/dev/mapper/36006016057a01e0080664e5d9fa4e111,/dev/mapper/36006016057a01e00c88c499a9ea4e111,/dev/mapper/36006016057a01e00ecf4bd78beafe111" logfile="/var/log/cluster/fence_scsi.log" name="scsi_dev"/>
  </fencedevices>
</cluster>
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster