Hi all,
I am trying to set up a Splunk cluster service on two RHEL 5.5 hosts (fully
updated). My problem starts when I try to run this service under rgmanager:
the script always fails. If I launch the script manually, everything works as
expected. If I test the service using the rg_test command, it also works as expected.
This is the log output when rgmanager tries to start the service:
Jan 10 17:50:55 lorien clurgmgrd[25394]: <notice> Starting disabled service
service:siemmgmt-svc
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <debug> Link for eth0: Detected
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <info> Adding IPv4 address
172.25.70.22/28 to eth0
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <debug> Pinging addr 172.25.70.22 from
dev eth0
Jan 10 17:50:57 lorien clurgmgrd: [25394]: <debug> Sending gratuitous ARP:
172.25.70.22 00:50:56:14:5a:1e brd ff:ff:ff:ff:ff:ff
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <warning> Unknown file system type 'ext4'
for device /dev/inasvol/splunkvol. Assuming fsck is required.
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <debug> Running fsck on
/dev/inasvol/splunkvol
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> mounting /dev/inasvol/splunkvol on
/data/services/siem/splunk
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <debug> mount -t ext4 -o rw
/dev/inasvol/splunkvol /data/services/siem/splunk
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> Executing
/data/config/etc/init.d/splunk-cluster start
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <err> script:splunk-cluster: start of
/data/config/etc/init.d/splunk-cluster failed (returned 1)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> start on script "splunk-cluster"
returned 1 (generic error)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <warning> #68: Failed to start
service:siemmgmt-svc; return value: 1
Jan 10 17:50:58 lorien clurgmgrd[25394]: <debug> Stopping failed service
service:siemmgmt-svc
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> Stopping service service:siemmgmt-svc
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> Executing
/data/config/etc/init.d/splunk-cluster stop
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <err> script:splunk-cluster: stop of
/data/config/etc/init.d/splunk-cluster failed (returned 1)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> stop on script "splunk-cluster"
returned 1 (generic error)
Jan 10 17:50:59 lorien clurgmgrd: [25394]: <info> unmounting /data/services/siem/splunk
Jan 10 17:50:59 lorien clurgmgrd: [25394]: <info> Removing IPv4 address
172.25.70.22/28 from eth0
Jan 10 17:51:09 lorien clurgmgrd[25394]: <crit> #12: RG service:siemmgmt-svc failed
to stop; intervention required
Jan 10 17:51:09 lorien clurgmgrd[25394]: <notice> Service service:siemmgmt-svc is failed
Jan 10 17:51:09 lorien clurgmgrd[25394]: <crit> #13: Service service:siemmgmt-svc
failed to stop cleanly
Jan 10 17:51:09 lorien clurgmgrd[25394]: <debug> Handling failure request for RG
service:siemmgmt-svc
Jan 10 17:51:19 lorien clurgmgrd[25394]: <debug> 2 events processed
And this is the output of the rg_test command:
[root@lorien ~]# rg_test test /etc/cluster/cluster.conf start service siemmgmt-svc
Running in test mode.
Starting siemmgmt-svc...
<warn> Unknown file system type 'ext4' for device /dev/inasvol/splunkvol.
Assuming fsck is required.
<debug> Running fsck on /dev/inasvol/splunkvol
<info> mounting /dev/inasvol/splunkvol on /data/services/siem/splunk
<debug> mount -t ext4 -o rw /dev/inasvol/splunkvol /data/services/siem/splunk
<debug> Link for eth0: Detected
<info> Adding IPv4 address 172.25.70.22/28 to eth0
<debug> Pinging addr 172.25.70.22 from dev eth0
<debug> Sending gratuitous ARP: 172.25.70.22 00:50:56:14:5a:1e brd ff:ff:ff:ff:ff:ff
<info> Executing /data/config/etc/init.d/splunk-cluster start
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -f /etc/sysconfig/i18n -a -z '' ']'
++ . /etc/profile.d/lang.sh
+++ sourced=0
+++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
+++ '[' -f /etc/sysconfig/i18n ']'
+++ . /etc/sysconfig/i18n
++++ LANG=en_US.UTF-8
++++ SYSFONT=latarcyrheb-sun16
+++ sourced=1
+++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
+++ '[' -f /.i18n ']'
+++ '[' -n '' ']'
+++ '[' 1 = 1 ']'
+++ '[' -n en_US.UTF-8 ']'
+++ export LANG
+++ '[' -n '' ']'
+++ unset LC_ADDRESS
+++ '[' -n '' ']'
+++ unset LC_CTYPE
+++ '[' -n '' ']'
+++ unset LC_COLLATE
+++ '[' -n '' ']'
+++ unset LC_IDENTIFICATION
+++ '[' -n '' ']'
+++ unset LC_MEASUREMENT
+++ '[' -n '' ']'
+++ unset LC_MESSAGES
+++ '[' -n '' ']'
+++ unset LC_MONETARY
+++ '[' -n '' ']'
+++ unset LC_NAME
+++ '[' -n '' ']'
+++ unset LC_NUMERIC
+++ '[' -n '' ']'
+++ unset LC_PAPER
+++ '[' -n '' ']'
+++ unset LC_TELEPHONE
+++ '[' -n '' ']'
+++ unset LC_TIME
+++ '[' -n C ']'
+++ '[' C '!=' en_US.UTF-8 ']'
+++ export LC_ALL
+++ '[' -n '' ']'
+++ unset LANGUAGE
+++ '[' -n '' ']'
+++ unset LINGUAS
+++ '[' -n '' ']'
+++ unset _XKB_CHARSET
+++ consoletype=pty
+++ '[' -z pty ']'
+++ '[' -n '' ']'
+++ '[' -n '' ']'
+++ '[' -n en_US.UTF-8 ']'
+++ case $LANG in
+++ '[' dumb = linux ']'
+++ unset SYSFONTACM SYSFONT
+++ unset sourced
+++ unset langfile
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/init ']'
++ . /etc/sysconfig/init
+++ BOOTUP=color
+++ GRAPHICAL=yes
+++ RES_COL=60
+++ MOVE_TO_COL='echo -en \033[60G'
+++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
+++ SETCOLOR_FAILURE='echo -en \033[0;31m'
+++ SETCOLOR_WARNING='echo -en \033[0;33m'
+++ SETCOLOR_NORMAL='echo -en \033[0;39m'
+++ LOGLEVEL=3
+++ PROMPT=yes
+++ AUTOSWAP=no
++ '[' pty = serial ']'
++ '[' color '!=' verbose ']'
++ INITLOG_ARGS=-q
++
__sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'
+ '[' '!' -d /data/services/siem/splunk/etc ']'
+ HOME=/data/services/siem/splunk
+ DIRECTORY=/data/services/siem/splunk
+ export HOME
+ case "$1" in
+ start
+ echo -n 'Starting Splunk: '
Starting Splunk:
+ sudo -H -u splunk /data/services/siem/splunk/bin/splunk start
Splunk> The IT Search Engine.
Checking prerequisites...
Checking http port [172.25.70.22:9000]: open
Checking mgmt port [172.25.70.22:9089]: open
Checking configuration... Done.
Checking index directory... Done.
Checking databases...
Validated databases: _audit, _blocksignature, _internal, _thefishbucket, history,
main, sample, summary
Checking for SELinux.
All preliminary checks passed.
[ OK ]
[ OK ]
Starting splunk server daemon (splunkd)... Done.Starting splunkweb... Done.
If you get stuck, we're here to help.
Look for answers here: http://www.splunk.com/base/Documentation
The Splunk web interface is at https://172.25.70.22:9000
+ RETVAL=0
+ '[' 0 -eq 0 ']'
+ success
+ '[' color '!=' verbose -a -z '' ']'
+ echo_success
+ '[' color = color ']'
+ echo -en '\033[60G'
+ echo -n '['
[+ '[' color = color ']'
+ echo -en '\033[0;32m'
+ echo -n ' OK '
OK + '[' color = color ']'
+ echo -en '\033[0;39m'
+ echo -n ']'
]+ echo -ne '\r'
+ return 0
+ return 0
+ echo
+ return 0
+ exit 0
Start of siemmgmt-svc complete
As you can see, everything works fine when the service is started with rg_test.
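One difference between the two runs may be the execution context: rg_test is started from a login shell with a terminal, while rgmanager runs as a daemon. The sketch below is one way to approximate the daemon case by hand. `run_bare` is a hypothetical helper name, not part of rgmanager, and the PATH value is simply the one shown in the trace above. Whether this actually reproduces the failure is an assumption that has not been verified.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an almost empty environment
# and with no terminal on stdin, stdout, or stderr.
# $1 is the file that receives all output; the rest is the command.
run_bare() {
    out="$1"; shift
    # env -i clears the inherited environment; only PATH is set again.
    env -i PATH=/sbin:/usr/sbin:/bin:/usr/bin "$@" < /dev/null > "$out" 2>&1
}
```

For example, `run_bare /tmp/splunk-bare.log /data/config/etc/init.d/splunk-cluster start` followed by `echo $?` would show whether the script still returns 0 without a terminal, and the log file would hold any error text.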
Service configuration under cluster.conf:
<service autostart="0" domain="PriCluster2" name="siemmgmt-svc" recovery="relocate">
    <fs ref="siemdata">
        <ip ref="172.25.70.22">
            <script ref="splunk-cluster"/>
        </ip>
    </fs>
</service>
Script:
#!/bin/sh -x
# Splunk: Controls Splunk on Redhat-based systems
#
# chkconfig: 2345 99 15
# description: Starts and stops Splunk
#
# This will work on Redhat systems (maybe others too)
# Source function library.
. /etc/init.d/functions
if [ ! -d /data/services/siem/splunk/etc ]; then
    exit 1
fi
HOME="/data/services/siem/splunk"
DIRECTORY="/data/services/siem/splunk"
export HOME
start() {
    echo -n "Starting Splunk: "
    sudo -H -u splunk ${DIRECTORY}/bin/splunk start > /dev/null
    RETVAL=$?
    if [ $RETVAL -eq 0 ]; then
        success
    else
        failure
    fi
    echo
    return $RETVAL
}
stop() {
    echo -n "Stopping Splunk: "
    sudo -H -u splunk ${DIRECTORY}/bin/splunk stop > /dev/null
    RETVAL=$?
    if [ $RETVAL -eq 0 ]; then
        success
    else
        failure
    fi
    echo
    return $RETVAL
}
status() {
    exit 0
}
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    status)
        status
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart|status}"
        exit 1
esac
exit $?
How can I debug this error? I don't know why the script fails when it is launched via rgmanager.
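Because the script sends the output of the sudo command to /dev/null, the rgmanager log only shows "returned 1". A minimal sketch of a logging wrapper that could be sourced by the script is shown below. `run_logged` and `DEBUG_LOG` are hypothetical names, and the log path is an assumption; any location writable by root would do.

```shell
#!/bin/sh
# Hypothetical debugging helper, not part of rgmanager or Splunk.
# DEBUG_LOG is an assumed path and can be overridden by the caller.
DEBUG_LOG="${DEBUG_LOG:-/tmp/splunk-cluster-debug.log}"

# Run a command, recording context, output, and exit status in DEBUG_LOG.
run_logged() {
    {
        echo "=== $(date) : $*"
        echo "user: $(id)"
        # tty prints "not a tty" when stdin is not a terminal
        echo "tty: $(tty 2>&1)"
        echo "PATH=$PATH"
    } >> "$DEBUG_LOG"
    # Keep both stdout and stderr of the command instead of discarding them.
    "$@" >> "$DEBUG_LOG" 2>&1
    rc=$?
    echo "exit status: $rc" >> "$DEBUG_LOG"
    return $rc
}
```

In start(), the sudo line could then be written as `run_logged sudo -H -u splunk ${DIRECTORY}/bin/splunk start`, so that the next failed start under rgmanager leaves the real error message, the user, and the terminal state in the log file.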
--
CL Martinez
carlopmart {at} gmail {d0t} com