Hi,

RHEL 5.4, cluster2 (I think).

I expected to be able to freeze a service on a node and restart rgmanager
on that node without interrupting the service. In practice, starting
rgmanager causes the service to be stopped. Is this what is supposed to
happen? I thought the whole point of freezing services was to allow
maintenance (including restarting cluster software). Are there any options
to prevent the services from being stopped when rgmanager is started?

One effect of rgmanager stopping the service is that the cluster reaches
an inconsistent state. Once rgmanager has restarted, the cluster believes
that the services are still frozen, whereas in reality they are stopped.
Any attempt to unfreeze the service causes the service to
fail over to a standby node.

Regards,
Martin

sudo /usr/sbin/clustat
Cluster Status for EDISV1DBM @ Mon Jun 21 16:27:05 2010
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 svXprdclu001                          1 Online, rgmanager
 svXprdclu002                          2 Online, Local, rgmanager
 svXprdclu003                          3 Online, rgmanager
 svXprdclu004                          4 Online, rgmanager
 svXprdclu005                          5 Online, rgmanager

 Service Name              Owner (Last)              State
 ------- ----              ----- ------              -----
 service:ACTIVESITE        svXprdclu002              started
 service:MASTERVIP         svXprdclu002              started

[martin@cp1edidbm002 ~]$ sudo /usr/sbin/clusvcadm -Z ACTIVESITE
Local machine freezing service:ACTIVESITE...Success
[martin@cp1edidbm002 ~]$ sudo /usr/sbin/clusvcadm -Z MASTERVIP
Local machine freezing service:MASTERVIP...Success

[martin@cp1edidbm002 ~]$ sudo /usr/sbin/clustat
Cluster Status for EDISV1DBM @ Mon Jun 21 16:34:02 2010
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 svXprdclu001                          1 Online, rgmanager
 svXprdclu002                          2 Online, Local, rgmanager
 svXprdclu003                          3 Online, rgmanager
 svXprdclu004                          4 Online, rgmanager
 svXprdclu005                          5 Online, rgmanager

 Service Name              Owner (Last)              State
 ------- ----              ----- ------              -----
 service:ACTIVESITE        svXprdclu002              started [Z]
 service:MASTERVIP         svXprdclu002              started [Z]

[martin@cp1edidbm002 ~]$ sudo /etc/init.d/rgmanager stop
Shutting down Cluster Service Manager...
Waiting for services to stop:                              [  OK  ]
Cluster Service Manager is stopped.
[martin@cp1edidbm002 ~]$ sudo /etc/init.d/rgmanager start
Starting Cluster Service Manager:                          [  OK  ]

#
# the services are stopped by rgmanager start. Ugh!
#

[martin@cp1edidbm002 ~]$ sudo /usr/sbin/clustat
Cluster Status for EDISV1DBM @ Mon Jun 21 16:35:34 2010
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 svXprdclu001                          1 Online, rgmanager
 svXprdclu002                          2 Online, Local, rgmanager
 svXprdclu003                          3 Online, rgmanager
 svXprdclu004                          4 Online, rgmanager
 svXprdclu005                          5 Online, rgmanager

 Service Name              Owner (Last)              State
 ------- ----              ----- ------              -----
 service:ACTIVESITE        svXprdclu002              started [Z]
 service:MASTERVIP         svXprdclu002              started [Z]

=========================================

The logs show that the service is stopped as rgmanager is started on
svXprdclu002.

Jun 21 16:31:19 cp1edidbm002 clurgmgrd: [14256]: <info> Executing /home/martin/dc-dsm status
Jun 21 16:34:58 cp1edidbm002 rgmanager: [15526]: <notice> Shutting down Cluster Service Manager...
Jun 21 16:34:58 cp1edidbm002 clurgmgrd[14256]: <notice> Shutting down
Jun 21 16:35:08 cp1edidbm002 clurgmgrd[14256]: <notice> Shutdown complete, exiting
Jun 21 16:35:08 cp1edidbm002 rgmanager: [15526]: <notice> Cluster Service Manager is stopped.
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: Using TCP for communications
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 4
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 5
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 1
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 3
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <notice> Resource Group Manager Starting
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <info> Loading Service Data
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <info> Initializing Services
Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /bin/true stop
Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Removing IPv4 address 10.3.17.20/24 from bond0
Jun 21 16:35:27 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /home/martin/dc-dsm stop
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> Services Initialized
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: Local UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu001 UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu003 UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu004 UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu005 UP
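
For anyone who wants to check their own logs for the same behaviour, something like the following could pull out the stop actions that rgmanager runs while it initializes. It is only a rough sketch: the function name and the regular expressions are my own assumptions, derived purely from the log lines quoted above, and are not part of rgmanager. It treats everything between "Resource Group Manager Starting" and "Services Initialized" as the startup window and reports any resource stop logged inside it.

```python
import re

# Assumed patterns, based only on the syslog excerpt in this post.
START_RE = re.compile(r"clurgmgrd\[\d+\]: <notice> Resource Group Manager Starting")
INIT_DONE_RE = re.compile(r"clurgmgrd\[\d+\]: <info> Services Initialized")
STOP_ACTION_RE = re.compile(
    r"clurgmgrd: \[\d+\]: <info> "
    r"(Executing \S+ stop\b|Removing IPv4 address \S+ from \S+)"
)


def stops_during_init(lines):
    """Return stop-type actions logged while rgmanager is initializing."""
    in_init = False
    found = []
    for line in lines:
        if START_RE.search(line):
            in_init = True          # daemon has just started
        elif INIT_DONE_RE.search(line):
            in_init = False         # startup window is over
        elif in_init:
            match = STOP_ACTION_RE.search(line)
            if match:
                found.append(match.group(1))
    return found


# Lines copied from the log excerpt in this post (unwrapped).
SAMPLE = [
    "Jun 21 16:31:19 cp1edidbm002 clurgmgrd: [14256]: <info> Executing /home/martin/dc-dsm status",
    "Jun 21 16:35:08 cp1edidbm002 clurgmgrd[14256]: <notice> Shutdown complete, exiting",
    "Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <notice> Resource Group Manager Starting",
    "Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <info> Initializing Services",
    "Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /bin/true stop",
    "Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Removing IPv4 address 10.3.17.20/24 from bond0",
    "Jun 21 16:35:27 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /home/martin/dc-dsm stop",
    "Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> Services Initialized",
]

if __name__ == "__main__":
    for action in stops_during_init(SAMPLE):
        print(action)
```

Run against the excerpt, it lists the three stop actions (the /bin/true script, the IP address removal, and the dc-dsm script), all of which fall inside the startup window even though both services were frozen beforehand.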

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster