Hi,

I have a problem with a RHEL 5.5 cluster. The mysqld service runs on the cluster. When there is any issue with the cluster, the service (hell) does not relocate automatically. I have even tried to enable it on the second node, but that fails. In that case we need to reboot both nodes and enable the service manually on one of them. HP iLO fencing is not working. Please find the /var/log/messages output below and suggest a fix.

Jun 9 02:46:25 indls0040 clurgmgrd[6530]: <notice> Stopping service service:hell
Jun 9 02:46:27 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:46:44 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:46:45 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 19710 seconds.
Jun 9 02:46:55 indls0040 clurgmgrd[6530]: <err> #52: Failed changing RG status
Jun 9 02:47:03 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:47:05 indls0040 clurgmgrd: [6530]: <warning> 10.48.64.82 is not configured
Jun 9 02:47:05 indls0040 clurgmgrd[6530]: <notice> Stopping service service:hell
Jun 9 02:47:15 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 19740 seconds.
Jun 9 02:47:20 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:47:35 indls0040 clurgmgrd[6530]: <err> #52: Failed changing RG status
Jun 9 02:47:38 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:47:45 indls0040 clurgmgrd: [6530]: <warning> 10.48.64.82 is not configured
Jun 9 02:47:45 indls0040 clurgmgrd[6530]: <notice> Stopping service service:hell
Jun 9 02:47:45 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 19770 seconds.
Jun 9 02:47:50 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:48:14 indls0040 last message repeated 2 times
Jun 9 02:48:15 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 19800 seconds.
Jun 9 02:48:15 indls0040 clurgmgrd[6530]: <err> #52: Failed changing RG status
Jun 9 02:48:23 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:48:25 indls0040 clurgmgrd: [6530]: <warning> 10.48.64.82 is not configured
Jun 9 02:48:25 indls0040 clurgmgrd[6530]: <notice> Stopping service service:hell
Jun 9 02:48:37 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:48:45 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 19830 seconds.
Jun 9 02:48:55 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:48:55 indls0040 clurgmgrd[6530]: <err> #52: Failed changing RG status
Jun 9 02:49:05 indls0040 clurgmgrd: [6530]: <warning> 10.48.64.82 is not configured
Jun 9 02:49:05 indls0040 clurgmgrd[6530]: <notice> Stopping service service:hell
Jun 9 02:49:13 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:49:15 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 19860 seconds.
Jun 9 02:49:26 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:49:35 indls0040 clurgmgrd[6530]: <err> #52: Failed changing RG status
Jun 9 02:49:45 indls0040 clurgmgrd: [6530]: <warning> 10.48.64.82 is not configured
Jun 9 02:49:45 indls0040 clurgmgrd[6530]: <notice> Stopping service service:hell
Jun 9 02:49:45 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 19890 seconds.
Jun 9 02:49:47 indls0040 dhclient: DHCPREQUEST on eth7 to 10.48.64.13 port 67
Jun 9 02:50:10 indls0040 last message repeated 2 times
Jun 9 02:50:15 indls0040 clurgmgrd[6530]: <err> #52: Failed changing RG status
Jun 9 10:03:59 indls0040 openais[23169]: [MAIN ] Using default multicast address of 239.192.67.158
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] Token Timeout (10000 ms) retransmit timeout (495 ms)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] token hold (386 ms) retransmits before loss (20 retrans)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1402
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] missed count const (5 messages)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] send threads (0 threads)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] RRP token expired timeout (495 ms)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] RRP token problem counter (2000 ms)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] RRP threshold (10 problem count)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] RRP mode set to none.
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] heartbeat_failures_allowed (0)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] max_network_delay (50 ms)
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] The network interface [10.48.65.54] is now up.
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] Created or loaded sequence id 7136704.10.48.65.54 for this ring.
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] entering GATHER state from 15.
Jun 9 10:04:00 indls0040 openais[23169]: [CMAN ] CMAN 2.0.115 (built Jul 28 2010 19:18:41) started
Jun 9 10:04:00 indls0040 openais[23169]: [MAIN ] Service initialized 'openais CMAN membership service 2.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais extended virtual synchrony service'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais cluster membership service B.01.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais availability management framework B.01.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais checkpoint service B.01.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais event service B.01.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais distributed locking service B.01.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais message service B.01.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais configuration service'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais cluster closed process group service v1.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SERV ] Service initialized 'openais cluster config database access v1.01'
Jun 9 10:04:00 indls0040 openais[23169]: [SYNC ] Not using a virtual synchrony filter.
Jun 9 10:04:00 indls0040 openais[23169]: [TOTEM] Creating commit token because I am the rep.
Jun 9 10:04:01 indls0040 openais[23169]: [CLM ] r(0) ip(10.48.64.67)
Jun 9 10:04:01 indls0040 openais[23169]: [SYNC ] This node is within the primary component and will provide service.
Jun 9 10:04:01 indls0040 openais[23169]: [TOTEM] entering OPERATIONAL state.
Jun 9 10:04:02 indls0040 openais[23169]: [CLM ] got nodejoin message 10.48.64.67
Jun 9 10:04:02 indls0040 openais[23169]: [CLM ] got nodejoin message 10.48.65.54
Jun 9 10:04:02 indls0040 openais[23169]: [CMAN ] cman killed by node 2 because we were killed by cman_tool or other application
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading all openais components
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_confdb v0 (19/10)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_cpg v0 (18/8)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_cfg v0 (17/7)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_msg v0 (16/6)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_lck v0 (15/5)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_evt v0 (14/4)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_ckpt v0 (13/3)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_amf v0 (12/2)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_clm v0 (11/1)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_evs v0 (10/0)
Jun 9 10:04:03 indls0040 openais[23169]: [SERV ] Unloading openais component: openais_cman v0 (9/9)
Jun 9 10:04:03 indls0040 dlm_controld[23196]: cluster is down, exiting
Jun 9 10:04:03 indls0040 fenced[23188]: cluster is down, exiting
Jun 9 10:04:03 indls0040 kernel: dlm: closing connection to node 1
Jun 9 10:04:03 indls0040 gfs_controld[23203]: cpg_join error 2
Jun 9 10:04:06 indls0040 fence_node[23194]: Fence of "indls0040.qdx.in" was unsuccessful
Jun 9 10:04:15 indls0040 ccsd[5222]: Unable to connect to cluster infrastructure after 45930 seconds.
Jun 9 10:04:16 indls0040 clurgmgrd[6530]: <err> #52: Failed changing RG status

Thanks,
Shankar
Attachment:
logs.docx
Description: application/vnd.openxmlformats-officedocument.wordprocessingml.document
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster