Hi,

I'm in the process of migrating a two-node cluster to two virtual machines. The real servers run clumanager-1.0.28-1 (RHEL3/CentOS3). I've migrated all the filesystems and started reconfiguring the cluster.

clustat on the real servers:

Cluster Status Monitor (Cluster)                                11:50:14
Cluster alias: Not Configured

========================= M e m b e r   S t a t u s ==========================

Member         Status     Node Id    Power Switch
-------------- ---------- ---------- ------------
cl1            Up         0          Good
cl2            Up         1          Good

========================= H e a r t b e a t   S t a t u s ====================

Name                           Type       Status
------------------------------ ---------- ------------
cl1          <--> cl2          network    ONLINE
cln1         <--> cln2         network    ONLINE

========================= S e r v i c e   S t a t u s ========================

                                       Last             Monitor  Restart
Service        Status   Owner          Transition       Interval Count
-------------- -------- -------------- ---------------- -------- -------
mysql1         started  cl2            00:16:28 Oct 23  10       1
nfs            started  cl2            23:20:58 Oct 08  10       0

Everything is about the same in the virtual cluster, except that the virtual nodes have no power switch and there is only one network. Both nodes use the network and quorum to check whether the other node is OK.

The problem is in the virtual cluster. I upgraded it to clumanager-1.2.34-3 to check whether this was a bug in the previous version. The nodes can't see each other through the network: each thinks the other is Inactive.

When I start clumanager on cl1 I get:

Jan 25 11:52:22 cl1 clumanager: [15039]: <notice> Starting Red Hat Cluster Manager...
Jan 25 11:52:22 cl1 cluquorumd[15053]: <warning> STONITH: No drivers configured for host 'cl1'!
Jan 25 11:52:22 cl1 cluquorumd[15053]: <warning> STONITH: Data integrity may be compromised!
Jan 25 11:52:22 cl1 cluquorumd[15053]: <warning> STONITH: No drivers configured for host 'cl2'!
Jan 25 11:52:22 cl1 cluquorumd[15053]: <warning> STONITH: Data integrity may be compromised!
Jan 25 11:52:22 cl1 clumanager: cluquorumd startup succeeded
Jan 25 11:52:33 cl1 clumembd[15056]: <notice> Member cl1 UP
Jan 25 11:52:34 cl1 cluquorumd[15054]: <notice> Quorum Formed; Starting Service Manager
Jan 25 11:52:34 cl1 clusvcmgrd: [15067]: <notice> service notice: Stopping service mysql ...
Jan 25 11:52:35 cl1 clusvcmgrd: [15067]: <notice> service notice: Running user script '/etc/init.d/mysql1 stop'
Jan 25 11:52:37 cl1 clusvcmgrd: [15067]: <notice> service notice: Stopped service mysql ...
Jan 25 11:52:37 cl1 clusvcmgrd: [15244]: <notice> service notice: Stopping service nfs ...
Jan 25 11:52:37 cl1 clusvcmgrd: [15244]: <notice> service notice: Stopped service nfs ...
Jan 25 11:52:37 cl1 clusvcmgrd[15381]: <notice> Starting stopped service mysql
Jan 25 11:52:37 cl1 clusvcmgrd[15395]: <notice> Starting stopped service nfs
Jan 25 11:52:37 cl1 clusvcmgrd: [15382]: <notice> service notice: Starting service mysql ...
Jan 25 11:52:37 cl1 clusvcmgrd: [15420]: <notice> service notice: Starting service nfs ...
Jan 25 11:52:37 cl1 kernel: kjournald starting. Commit interval 5 seconds
Jan 25 11:52:37 cl1 kernel: EXT3 FS on hda5, internal journal
Jan 25 11:52:37 cl1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jan 25 11:52:37 cl1 /sbin/hotplug: no runnable /etc/hotplug/block.agent is installed
Jan 25 11:52:38 cl1 clusvcmgrd: [15382]: <notice> service notice: Running user script '/etc/init.d/mysql1 start'
Jan 25 11:52:38 cl1 clusvcmgrd: [15382]: <notice> service notice: Started service mysql ...
Jan 25 11:52:38 cl1 clusvcmgrd: [15420]: <notice> service notice: Started service nfs ...

Everything seems OK. Then I start clumanager on cl2:

cl2 -bash: (1836) [root.root] |.| /etc/init.d/clumanager start

Jan 25 11:54:56 cl2 clumanager: [7651]: <notice> Starting Red Hat Cluster Manager...
Jan 25 11:54:56 cl2 cluquorumd[7665]: <warning> STONITH: No drivers configured for host 'cl1'!
Jan 25 11:54:56 cl2 cluquorumd[7665]: <warning> STONITH: Data integrity may be compromised!
Jan 25 11:54:56 cl2 cluquorumd[7665]: <warning> STONITH: No drivers configured for host 'cl2'!
Jan 25 11:54:56 cl2 cluquorumd[7665]: <warning> STONITH: Data integrity may be compromised!
Jan 25 11:54:56 cl2 clumanager: cluquorumd startup succeeded
Jan 25 11:55:07 cl2 clumembd[7670]: <notice> Member cl2 UP
Jan 25 11:55:08 cl2 cluquorumd[7666]: <warning> Membership reports #0 as down, but disk reports as up: State uncertain!
Jan 25 11:55:08 cl2 cluquorumd[7666]: <notice> Quorum Formed; Starting Service Manager
Jan 25 11:55:08 cl2 clusvcmgrd: [7679]: <notice> service notice: Stopping service mysql ...
Jan 25 11:55:08 cl2 clusvcmgrd: [7679]: <notice> service notice: Running user script '/etc/init.d/mysql1 stop'
Jan 25 11:55:10 cl2 clusvcmgrd: [7679]: <notice> service notice: Stopped service mysql ...
Jan 25 11:55:10 cl2 clusvcmgrd: [7856]: <notice> service notice: Stopping service nfs ...
Jan 25 11:55:10 cl2 clusvcmgrd: [7856]: <notice> service notice: Stopped service nfs ...

Now we have a problem:

"cluquorumd[7666]: <warning> Membership reports #0 as down, but disk reports as up: State uncertain!"

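In case it helps anyone reading along, here is a minimal Python sketch for pulling only the <warning> entries out of syslog lines like the ones above, so the "State uncertain" message doesn't get lost among the notices. The regex is only inferred from the lines in this mail (it is not a documented clumanager log format), and the names LINE_RE, parse_line and cluster_warnings are hypothetical.

```python
import re

# Assumption: syslog lines look like the ones quoted in this mail, e.g.
#   "Jan 25 11:55:08 cl2 cluquorumd[7666]: <warning> ..."
#   "Jan 25 11:52:22 cl1 clumanager: [15039]: <notice> ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"   # timestamp
    r"(?P<host>\S+)\s+"                                   # logging host
    r"(?P<daemon>[\w.-]+)(?::\s*)?\[(?P<pid>\d+)\]:\s+"   # daemon and pid
    r"<(?P<level>\w+)>\s+(?P<msg>.*)$"                    # level and message
)


def parse_line(line):
    """Return a dict of fields for a cluster log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def cluster_warnings(lines):
    """Keep only the parsed entries whose level is 'warning'."""
    parsed = (parse_line(line) for line in lines)
    return [rec for rec in parsed if rec and rec["level"] == "warning"]


if __name__ == "__main__":
    sample = [
        "Jan 25 11:55:07 cl2 clumembd[7670]: <notice> Member cl2 UP",
        "Jan 25 11:55:08 cl2 cluquorumd[7666]: <warning> Membership reports #0 "
        "as down, but disk reports as up: State uncertain!",
        "Jan 25 11:54:56 cl2 clumanager: cluquorumd startup succeeded",
    ]
    for rec in cluster_warnings(sample):
        print(rec["ts"], rec["host"], rec["daemon"], rec["msg"])
```

Lines without a <level> tag (the kernel and hotplug messages, for instance) simply don't match and are skipped.
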
clustat on cl1 reports:

Cluster Status - Cluster                                               11:54:16
Cluster Quorum Incarnation #1
Shared State: Shared Raw Device Driver v1.2

Member             Status
------------------ ----------
cl1                Active     <-- You are here
cl2                Inactive

Service        Status   Owner (Last)     Last Transition Chk Restarts
-------------- -------- ---------------- --------------- --- --------
mysql          started  cl1              11:52:37 Jan 25  20        0
nfs            started  cl1              11:52:37 Jan 25   0        0

clustat on cl2 reports:

Cluster Status - Cluster                                               11:56:30
Cluster Quorum Incarnation #1
Shared State: Shared Raw Device Driver v1.2

Member             Status
------------------ ----------
cl1                Inactive
cl2                Active     <-- You are here

Service        Status   Owner (Last)     Last Transition Chk Restarts
-------------- -------- ---------------- --------------- --- --------
mysql          started  cl1              11:52:37 Jan 25  20        0
nfs            started  cl1              11:52:37 Jan 25   0        0

Network connectivity is working:

[root@cl1 root]# ping -c2 -s30000 cl2
PING cl2 (172.30.5.112) 30000(30028) bytes of data.
30008 bytes from cl2 (172.30.5.112): icmp_seq=0 ttl=64 time=1.08 ms
30008 bytes from cl2 (172.30.5.112): icmp_seq=1 ttl=64 time=1.09 ms

[root@cl2 root]# ping -c2 -s30000 cl1
PING cl1 (172.30.5.111) 30000(30028) bytes of data.
30008 bytes from cl1 (172.30.5.111): icmp_seq=0 ttl=64 time=1.09 ms
30008 bytes from cl1 (172.30.5.111): icmp_seq=1 ttl=64 time=0.998 ms

Quorum seems OK, but the network doesn't.

[root@cl1 root]# shutil -p /cluster/header
/cluster/header is 144 bytes long
SharedStateHeader {
        ss_magic = 0x39119fcd
        ss_timestamp = 0x000000004798e63b (19:25:47 Jan 24 2008)
        ss_updateHost = cl1.datacenter.imoportal.pt
}

[root@cl2 root]# shutil -p /cluster/header
/cluster/header is 144 bytes long
SharedStateHeader {
        ss_magic = 0x39119fcd
        ss_timestamp = 0x000000004798e63b (19:25:47 Jan 24 2008)
        ss_updateHost = cl1.datacenter.imoportal.pt
}

Any ideas?

Thanks,
Nuno Fernandes
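
P.S. To make the disagreement between the two clustat views explicit, here is a rough Python sketch that reads the Member/Status section of each node's clustat output and lists the members whose status differs between the two views. It assumes the 1.2-style layout shown in this mail (a "Member Status" header, a dashed rule, then one member per line); the function names member_status and split_view are hypothetical and not part of clumanager.

```python
def member_status(clustat_text):
    """Map member name -> status from the Member/Status section of clustat output."""
    statuses = {}
    in_members = False
    for line in clustat_text.splitlines():
        fields = line.split()
        if not fields:
            # A blank line after at least one member row ends the section.
            if in_members and statuses:
                break
            continue
        if fields[:2] == ["Member", "Status"]:
            in_members = True
            continue
        if not in_members or fields[0].startswith("-"):
            continue  # text before the section, or the dashed rule
        if fields[0] == "Service":
            break
        if len(fields) >= 2:
            # Anything after the status (e.g. "<-- You are here") is ignored.
            statuses[fields[0]] = fields[1]
    return statuses


def split_view(view_a, view_b):
    """Return the members that the two views report with different statuses."""
    common = view_a.keys() & view_b.keys()
    return sorted(m for m in common if view_a[m] != view_b[m])


if __name__ == "__main__":
    from_cl1 = """Member             Status
------------------ ----------
cl1                Active     <-- You are here
cl2                Inactive
"""
    from_cl2 = """Member             Status
------------------ ----------
cl1                Inactive
cl2                Active     <-- You are here
"""
    print(split_view(member_status(from_cl1), member_status(from_cl2)))
```

With the two views from this mail, both cl1 and cl2 are reported, since each node marks itself Active and its peer Inactive.
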