On Wed, Jan 12, 2005 at 05:10:40PM +0300, Sergey wrote:
> Hello!
>
> > It looks like you are not using pool.
>
> Thanks, I followed your examples, so the raid can now be mounted.
>
> Now I have some questions about the Cluster Configuration System files.
>
> I have 2 nodes - hp1 and hp2. Each node has Integrated Lights-Out
> with ROM Version: 1.55 - 04/16/2004.
>
> Since I have only 2 nodes, one of them has to be master, but if the
> first of them (the master) is shut down cleanly, the slave runs into
> serious problems that can only be solved by resetting it. Is that
> normal? How do I make it work correctly?
>
> I tried servers = ["hp1","hp2","hp3"] (hp3 is really absent); then,
> if the master is shut down, the second node becomes master.

The nodes in the servers config line for gulm form a mini-cluster of
sorts. There must be quorum (51%) of the nodes in this mini-cluster
present for things to continue. You must have two of the three servers
up and running so that the mini-cluster has quorum, which then will
allow the other nodes to connect.

> So, if the nodes are alternately shut down cleanly and booted up, the
> master role switches from one to the other and everything seems OK,
> but if one of the nodes is shut down incorrectly (e.g. the power cord
> is pulled out of the socket), this is written in the system log:
>
> Jan 12 14:44:33 hp1 lock_gulmd_core[6500]: hp2 missed a heartbeat (time:1105530273952756 mb:1)
> Jan 12 14:44:48 hp1 lock_gulmd_core[6500]: hp2 missed a heartbeat (time:1105530288972780 mb:2)
> Jan 12 14:45:03 hp1 lock_gulmd_core[6500]: hp2 missed a heartbeat (time:1105530303992751 mb:3)
> Jan 12 14:45:03 hp1 lock_gulmd_core[6500]: Client (hp2) expired
> Jan 12 14:45:03 hp1 lock_gulmd_core[6500]: Core lost slave quorum. Have 1, need 2. Switching to Arbitrating.
> Jan 12 14:45:03 hp1 lock_gulmd_core[6614]: Gonna exec fence_node hp2
> Jan 12 14:45:03 hp1 lock_gulmd_core[6500]: Forked [6614] fence_node hp2 with a 0 pause.
> Jan 12 14:45:03 hp1 fence_node[6614]: Performing fence method, riloe, on hp2.
> Jan 12 14:45:04 hp1 fence_node[6614]: The agent (fence_rib) reports:
> Jan 12 14:45:04 hp1 fence_node[6614]: WARNING! fence_rib is deprecated. use fence_ilo instead
> parse error: unknown option "ipaddr=10.10.0.112"
>
> If I start the lock_gulm service again on the second node, this is
> written in the system log on the first node:
>
> Jan 12 14:50:14 hp1 lock_gulmd_core[7148]: Gonna exec fence_node hp2
> Jan 12 14:50:14 hp1 fence_node[7148]: Performing fence method, riloe, on hp2.
> Jan 12 14:50:14 hp1 fence_node[7148]: The agent (fence_rib) reports:
> Jan 12 14:50:14 hp1 fence_node[7148]: WARNING! fence_rib is deprecated. use fence_ilo instead
> parse error: unknown option "ipaddr=10.10.0.112"
> Jan 12 14:50:14 hp1 fence_node[7148]:
> Jan 12 14:50:14 hp1 fence_node[7148]: All fencing methods FAILED!
> Jan 12 14:50:14 hp1 fence_node[7148]: Fence of "hp2" was unsuccessful.
> Jan 12 14:50:14 hp1 lock_gulmd_core[6500]: Fence failed. [7148] Exit code:1 Running it again.
> Jan 12 14:50:14 hp1 lock_gulmd_core[6500]: Forked [7157] fence_node hp2 with a 5 pause.
> Jan 12 14:50:15 hp1 lock_gulmd_core[6500]: (10.10.0.201:hp2) Cannot login if you are expired.

The node hp2 has to be successfully fenced before it is allowed to
re-join the cluster. If your fencing is misconfigured or not working, a
fenced node will never get to rejoin. You really should test that
fencing works by running fence_node <node name> for each node in your
cluster before running lock_gulmd. This makes sure that fencing is set
up and working correctly.
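For example (just a sketch: it assumes ccsd is already running on both
nodes so the agent can read your fence.ccs/nodes.ccs, and that
lock_gulmd has not been started yet):

    # run on hp1: should power-cycle hp2 through its iLO
    fence_node hp2

    # then run on hp2: should power-cycle hp1
    fence_node hp1

If those don't reliably reboot the other node, fix the fencing setup
before going any further.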
Do that, and once you've verified that fencing is correct (without
lock_gulmd running), try things again with lock_gulmd.

> And I can't umount the GFS file system or reboot the systems because
> GFS is mounted; the only way out is to reset both nodes.
>
> I think I have mistakes in my configuration; maybe it is because of an
> incorrect agent = "fence_rib", or something else.
>
> Please help :-)
>
>
> Cluster Configuration:
>
> cluster.ccs:
> cluster {
>     name = "cluster"
>     lock_gulm {
>         servers = ["hp1"]    (or servers = ["hp1","hp2","hp3"])
>     }
> }
>
> fence.ccs:
> fence_devices {
>     ILO-HP1 {
>         agent = "fence_rib"
>         ipaddr = "10.10.0.111"
>         login = "xx"
>         passwd = "xx"
>     }
>     ILO-HP2 {
>         agent = "fence_rib"
>         ipaddr = "10.10.0.112"
>         login = "xx"
>         passwd = "xx"
>     }
> }
>
> nodes.ccs:
> nodes {
>     hp1 {
>         ip_interfaces { eth0 = "10.10.0.200" }
>         fence { riloe { ILO-HP1 { localport = 17988 } } }
>     }
>     hp2 {
>         ip_interfaces { eth0 = "10.10.0.201" }
>         fence { riloe { ILO-HP2 { localport = 17988 } } }
>     }
>     # if 3 nodes in cluster.ccs
>     # hp3 {
>     #     ip_interfaces { eth0 = "10.10.0.201" }
>     #     fence { riloe { ILO-HP2 { localport = 17988 } } }
>     # }
> }
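About the fence.ccs above: the parse error in your log is the agent
rejecting the ipaddr option, and fence_rib is telling you it is
deprecated in favour of fence_ilo. A sketch of what I would try
instead, assuming the iLO agent wants a hostname parameter rather than
ipaddr (that option name is an assumption on my part; check the
fence_ilo man page and the fence device examples for your GFS release):

    fence_devices {
        ILO-HP1 {
            agent = "fence_ilo"
            hostname = "10.10.0.111"   # assumption: hostname, not ipaddr
            login = "xx"
            passwd = "xx"
        }
        ILO-HP2 {
            agent = "fence_ilo"
            hostname = "10.10.0.112"   # assumption: hostname, not ipaddr
            login = "xx"
            passwd = "xx"
        }
    }

Whatever the exact option names turn out to be, the fence_node test
described above will tell you right away whether the agent accepts them.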
--
Michael Conrad Tadpol Tilstra
Hi, I'm an evil mutated signature virus,
put me in your .sig or I will bite your kneecaps!