David Teigland wrote:
On Fri, Sep 21, 2007 at 05:50:09PM +0200, carlopmart wrote:
David Teigland wrote:
On Fri, Sep 21, 2007 at 05:29:22PM +0200, carlopmart wrote:
[root@thranduil log]# mount -t gfs /dev/xvdc1 /data
/sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -22
/sbin/mount.gfs: error mounting lockproto lock_dlm
This has already been changed to report a descriptive error message,
"node not a member of the default fence domain"
as is shown in the debug log from gfs_controld below, and I suspect
appears in your /var/log/messages.
1190388485 mount: not in default fence domain
1190388485 datavol01 do_mount: rv -22
[root@thranduil log]# group_tool -v; group_tool dump gfs
type   level  name     id        state            node  id         local_done
fence  0      default  00010001  JOIN_START_WAIT  1     100010001  0
[1]
This shows it's not in the fence domain yet. The reason appears to be
that it's trying to fence someone. Again, look in /var/log/messages to
find out more information about what needs to be fenced, or why fencing
isn't working.
Dave
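The check Dave describes (reading the fence group's state out of the
`group_tool -v` output) can be sketched in a few lines of Python. This is
only an illustration: the column order is taken from the output pasted in
this thread, the helper names are made up, and the idea that a fully joined
group reports the state "none" is an assumption rather than something
stated above.

```python
def parse_group_line(line):
    """Split one data row of `group_tool -v` output into named fields.

    The field order (type, level, name, id, state) follows the header
    shown in this thread; any remaining columns are kept in "extra".
    """
    fields = line.split()
    keys = ["type", "level", "name", "id", "state"]
    row = dict(zip(keys, fields))
    row["extra"] = fields[len(keys):]
    return row


def fence_domain_joined(row):
    # Assumption: a group that has finished joining reports the state
    # "none"; anything else (such as JOIN_START_WAIT) is still pending.
    return row["type"] == "fence" and row["state"] == "none"


# Row copied from the output in this thread.
sample = "fence 0 default 00010001 JOIN_START_WAIT 1 100010001 0"
row = parse_group_line(sample)
print(row["name"], row["state"], fence_domain_joined(row))
```

On the row from this thread the helper reports that the join has not
finished, which matches Dave's reading of the output.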
Correct, Dave. The error is:
Sep 21 16:51:25 thranduil fenced[1081]: fencing node "elrond.hpulabs.org"
Sep 21 16:51:25 thranduil fenced[1081]: fence "elrond.hpulabs.org" failed
And that is expected: "elrond.hpulabs.org" is the node that I can't start up
(it is down for hardware maintenance until Monday). I need to start all the
other cluster services on thranduil and haldir. Is that possible?
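To pull the name of the node that could not be fenced out of the log, a small
filter along these lines may help. It is a sketch only: the regular
expression is modeled on the two fenced lines quoted above, and the function
name is hypothetical. Other fenced versions may word the message differently.

```python
import re

# Pattern modeled on the 'fence "<node>" failed' line quoted in this thread.
FENCE_FAILED = re.compile(r'fenced\[\d+\]: fence "([^"]+)" failed')


def failed_fence_targets(lines):
    """Return the node names from failed fence attempts, in first-seen order."""
    seen = []
    for line in lines:
        match = FENCE_FAILED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# The two log lines from this thread.
sample_log = [
    'Sep 21 16:51:25 thranduil fenced[1081]: fencing node "elrond.hpulabs.org"',
    'Sep 21 16:51:25 thranduil fenced[1081]: fence "elrond.hpulabs.org" failed',
]
print(failed_fence_targets(sample_log))
```

In practice the lines would come from /var/log/messages, for example by
passing an open file object instead of the sample list.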
Two options:
1. Remove that node from cluster.conf so it's not fenced every time the
cluster starts up.
2. Manually override/ack the fencing operation every time it happens with:
fence_ack_manual -n elrond.hpulabs.org. This will allow things to
continue.
Dave
The first option isn't possible, because I can't restore cluster.conf on the
other two nodes when elrond comes back up.
The second option gives me this error:
[root@thranduil ~]# clustat
Member Status: Quorate

  Member Name                  ID   Status
  ------ ----                  ---- ------
  thranduil.hpulabs.org        1    Online, Local, rgmanager
  haldir.hpulabs.org           2    Online, rgmanager
  elrond.hpulabs.org           3    Offline

  Service Name          Owner (Last)               State
  ------- ----          ----- ------               -----
  service:rsync-svc     (none)                     stopped
  service:wwwsoft-svc   (none)                     stopped
  service:proxy-svc     (thranduil.hpulabs.org)    stopped
  service:mail-svc      (none)                     stopped

[root@thranduil ~]# fence_ack_manual -n elrond.hpulabs.org
Warning: If the node "elrond.hpulabs.org" has not been manually fenced
(i.e. power cycled or disconnected from shared storage devices)
the GFS file system may become corrupted and all its data
unrecoverable! Please verify that the node shown above has
been reset or disconnected from storage.
Are you certain you want to continue? [yN] y
can't open /tmp/fence_manual.fifo: No such file or directory
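The last line says the FIFO that fence_ack_manual tries to open is missing.
One possible reading, which is an assumption and not confirmed anywhere in
this thread, is that the FIFO only exists while a manual fence operation is
actually waiting for an acknowledgement. A quick way to check whether it is
there before running the ack command could look like this (the path comes
from the error message; the function name is made up):

```python
import os
import stat

# Path taken from the error message above.
FIFO_PATH = "/tmp/fence_manual.fifo"


def manual_fence_pending(path=FIFO_PATH):
    """Return True only if `path` exists and is a FIFO (named pipe)."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return False
    return stat.S_ISFIFO(mode)


if manual_fence_pending():
    print("FIFO present: something may be waiting for fence_ack_manual")
else:
    print("no FIFO: nothing appears to be waiting for a manual ack")
```

If the FIFO is absent, it may be worth checking which fence method is
configured for elrond in cluster.conf before retrying.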
--
CL Martinez
carlopmart {at} gmail {d0t} com