Hi all!
I have a problem with a 3-node cluster. When I run "fence_node node1",
node1 is rebooted by DRAC successfully. When node1 comes back up, it freezes:
------------------
starting clvmd: dlm: got connection from 32
dlm: connecting to 33
dlm: got connection from 33
[frozen]
* cman_tool services shows:
type             level name       id       state
fence            0     default    0001001f none
[31 32 33]
dlm              1     clvmd      00010020 none
[31 32 33]
dlm              1     rgmanager  00020020 none
[32 33]
It seems the rgmanager group does not include node 31 (?)
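To spot this kind of gap without reading the listing by eye, here is a minimal Python sketch that parses the "cman_tool services" text shown above and reports which groups lack an expected node. The function names (parse_services, missing_members) and the assumed layout (one five-column row per group, followed by a bracketed member list) are my own assumptions based on this output only, not anything provided by cman.

```python
import re


def parse_services(text):
    """Parse `cman_tool services` style output into a list of group dicts.

    Assumes each group is one five-column row (type, level, name, id, state)
    followed by a bracketed list of member node IDs, as in the listing above.
    """
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("type"):
            continue  # skip blank lines and the header row
        if line.startswith("["):
            # A member list such as "[31 32 33]" belongs to the previous group.
            if groups:
                groups[-1]["members"] = {int(n) for n in re.findall(r"\d+", line)}
            continue
        fields = line.split()
        if len(fields) >= 5:
            gtype, level, name, gid, state = fields[:5]
            groups.append({"type": gtype, "level": int(level), "name": name,
                           "id": gid, "state": state, "members": set()})
    return groups


def missing_members(groups, expected):
    """Return {group name: sorted node IDs from `expected` absent from the group}."""
    result = {}
    for group in groups:
        absent = sorted(set(expected) - group["members"])
        if absent:
            result[group["name"]] = absent
    return result


# The listing from this post, pasted verbatim.
SAMPLE = """\
type level name id state
fence 0 default 0001001f none
[31 32 33]
dlm 1 clvmd 00010020 none
[31 32 33]
dlm 1 rgmanager 00020020 none
[32 33]
"""

if __name__ == "__main__":
    print(missing_members(parse_services(SAMPLE), {31, 32, 33}))
    # prints {'rgmanager': [31]}
```

On a live node the same text could be captured from the command itself and passed to parse_services instead of SAMPLE.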
* clustat shows:
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
xenr3u1.domain.com 31 Online
xenr3u2.domain.com 32 Online, Local
xenr3u3.domain.com 33 Online
-------------------
Then I rebooted node1 again:
Starting cluster
Loading modules DLM .......
done
starting ccsd
starting cman
starting daemons
starting fencing
[frozen again]
After a long time, "starting fencing" reports [done], but cman_tool services shows failed states.
* cman_tool services shows:
type             level name       id       state
fence            0     default    0001001f FAIL_ALL_STOPPED
[31 32 33]
dlm              1     clvmd      00010020 FAIL_STOP_WAIT
[31 32 33]
dlm              1     rgmanager  00020020 FAIL_STOP_WAIT
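For watching whether the groups ever recover, a small Python sketch like the following could flag every group whose state column is not "none". It assumes that "none" is the normal idle state (as in the first listing) and that each group row has exactly five columns; stuck_groups is a hypothetical helper name, not a cman tool.

```python
def stuck_groups(text):
    """Return (name, state) pairs for groups whose state is not "none"."""
    stuck = []
    for line in text.splitlines():
        fields = line.split()
        # Group rows have five columns; skip the header and "[...]" member lists.
        if len(fields) != 5 or fields[0] == "type" or fields[0].startswith("["):
            continue
        name, state = fields[2], fields[4]
        if state != "none":
            stuck.append((name, state))
    return stuck


# The second listing from this post, pasted verbatim.
FAILED_SAMPLE = """\
type level name id state
fence 0 default 0001001f FAIL_ALL_STOPPED
[31 32 33]
dlm 1 clvmd 00010020 FAIL_STOP_WAIT
[31 32 33]
dlm 1 rgmanager 00020020 FAIL_STOP_WAIT
"""

if __name__ == "__main__":
    for name, state in stuck_groups(FAILED_SAMPLE):
        print(name, state)
```

Run against the listing above, it reports all three groups (default, clvmd, rgmanager) as stuck.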
* clustat shows:
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
xenr3u1.domain.com 31 Online
xenr3u2.domain.com 32 Online, Local
xenr3u3.domain.com 33 Online
/etc/init.d/rgmanager restart
Shutting down Cluster Service Manager...
Waiting for services to stop:
[hangs for a very long time]
----------------------------------
I found this page (translated into English):
http://translate.google.com/translate?u=http%3A%2F%2Fken-etsu-tech.blogspot.com%2F2007%2F11%2Fred-hat-cluster-kernel-xen.html&langpair=ja%7Cen&hl=es&ie=UTF-8
It describes exactly the same problem. Is it a kernel bug? A clvmd bug?
Linux xenr3u2 2.6.18-8.1.15.el5xen #1 SMP Mon Oct 22 09:01:12 EDT 2007
x86_64 x86_64 x86_64 GNU/Linux
cman-2.0.64-1.0.1.el5
rgmanager-2.0.24-1.el5.centos
lvm2-cluster-2.02.16-3.el5
Sometimes the node starts OK and the cman_tool output is fine as well.
* /etc/lvm/lvm.conf:
devices {
    dir = "/dev"
    scan = [ "/dev" ]
    filter = [ "a/.*/" ]
    cache = "/etc/lvm/.cache"
    write_cache_state = 1
    sysfs_scan = 1
    md_component_detection = 1
}
log {
    verbose = 0
    syslog = 1
    overwrite = 0
    level = 0
    indent = 1
    command_names = 0
    prefix = " "
}
backup {
    backup = 1
    backup_dir = "/etc/lvm/backup"
    archive = 1
    archive_dir = "/etc/lvm/archive"
    retain_min = 10
    retain_days = 30
}
shell {
    history_size = 100
}
global {
    library_dir = "/usr/lib64"
    umask = 077
    test = 0
    activation = 1
    proc = "/proc"
    locking_type = 3
    fallback_to_clustered_locking = 1
    fallback_to_local_locking = 1
    locking_dir = "/var/lock/lvm"
}
activation {
    missing_stripe_filler = "/dev/ioerror"
    reserved_stack = 256
    reserved_memory = 8192
    process_priority = -18
    mirror_region_size = 512
    mirror_log_fault_policy = "allocate"
    mirror_device_fault_policy = "remove"
}
That's all ;-)
Thanks in advance
Jorge Gonzalez y Hurtado de Mendoza
Tecnico de Sistemas, DEGESYS
jorge.gonzalez@xxxxxxxxxxx
http://www.degesys.com