hello, sorry to ask, but is the "none" state a normal state for services? I have issues with cluster services too, and I've been told that this state is not normal: it indicates that the node didn't join the fence domain, which causes problems for rgmanager as well. What do clustat and cman_tool services show at startup?

regards,
Mathieu

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On behalf of Jorge Gonzalez
Sent: Thursday, January 10, 2008 17:18
To: linux-cluster@xxxxxxxxxx
Subject: Cluster fails after fencing by DRAC

Hi all!

I have a problem with a 3-node cluster. When I run "fence_node node1", node1 is rebooted by DRAC successfully. When node1 restarts, it freezes:

------------------
starting clvmd:
dlm: got connection from 32
dlm: connecting to 33
dlm: got connection from 33
[frozen]

* cman_tool services shows:

type             level name       id       state
fence            0     default    0001001f none
[31 32 33]
dlm              1     clvmd      00010020 none
[31 32 33]
dlm              1     rgmanager  00020020 none
[32 33]

It seems rgmanager does not have 31 (?)

* clustat shows:

Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  xenr3u1.domain.com                   31 Online
  xenr3u2.domain.com                   32 Online, Local
  xenr3u3.domain.com                   33 Online
-------------------

Then I rebooted node1 again:

Starting cluster
  Loading modules DLM ....... done
  starting ccsd
  starting cman
  starting daemons
  starting fencing
[frozen again]

after a long time:

  starting fencing [done]

but cman_tool services fails.

* cman_tool services shows:

type             level name       id       state
fence            0     default    0001001f FAIL_ALL_STOPPED
[31 32 33]
dlm              1     clvmd      00010020 FAIL_STOP_WAIT
[31 32 33]
dlm              1     rgmanager  00020020 FAIL_STOP_WAIT

* clustat shows:

Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  xenr3u1.domain.com                   31 Online
  xenr3u2.domain.com                   32 Online, Local
  xenr3u3.domain.com                   33 Online

/etc/init.d/rgmanager restart
Shutting down Cluster Service Manager...
Waiting for services to stop: [long timeeeeeeee]
----------------------------------

I found this page translated to English:
http://translate.google.com/translate?u=http%3A%2F%2Fken-etsu-tech.blogspot.com%2F2007%2F11%2Fred-hat-cluster-kernel-xen.html&langpair=ja%7Cen&hl=es&ie=UTF-8
It describes exactly the same problem. A kernel bug? A clvmd bug?

Linux xenr3u2 2.6.18-8.1.15.el5xen #1 SMP Mon Oct 22 09:01:12 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
cman-2.0.64-1.0.1.el5
rgmanager-2.0.24-1.el5.centos
lvm2-cluster-2.02.16-3.el5

Sometimes the node starts OK and cman_tool is also OK.

* /etc/lvm.conf:

devices {
    dir = "/dev"
    scan = [ "/dev" ]
    filter = [ "a/.*/" ]
    cache = "/etc/lvm/.cache"
    write_cache_state = 1
    sysfs_scan = 1
    md_component_detection = 1
}
log {
    verbose = 0
    syslog = 1
    overwrite = 0
    level = 0
    indent = 1
    command_names = 0
    prefix = " "
}
backup {
    backup = 1
    backup_dir = "/etc/lvm/backup"
    archive = 1
    archive_dir = "/etc/lvm/archive"
    retain_min = 10
    retain_days = 30
}
shell {
    history_size = 100
}
global {
    library_dir = "/usr/lib64"
    umask = 077
    test = 0
    activation = 1
    proc = "/proc"
    locking_type = 3
    fallback_to_clustered_locking = 1
    fallback_to_local_locking = 1
    locking_dir = "/var/lock/lvm"
}
activation {
    missing_stripe_filler = "/dev/ioerror"
    reserved_stack = 256
    reserved_memory = 8192
    process_priority = -18
    mirror_region_size = 512
    mirror_log_fault_policy = "allocate"
    mirror_device_fault_policy = "remove"
}

That's all ;-)

Thanks in advance
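
For reference, a minimal sketch of the startup checks Mathieu asks about, run on the rebooted node once it comes up and before touching clvmd or rgmanager. This assumes the stock RHEL 5 / CentOS 5 cluster tools (cman 2.0); the node names are simply the ones from the clustat output above.

    # run on node1 (xenr3u1) right after it finishes booting
    cman_tool status      # quorum state, cluster generation, member count
    cman_tool nodes       # per-node membership: "M" = member, "X" = not joined
    cman_tool services    # fence/dlm/rgmanager groups; a stuck join or FAIL_* state shows here
    group_tool ls         # the same group information from groupd's point of view
    clustat               # rgmanager's view of members and services

If "fence 0 default" never reaches a settled state for the freshly booted node, the fence-domain join is the thing to investigate before anything layered on top of it.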
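
And, assuming the stock init scripts are in use, the usual start order on such a node is roughly the following; if "starting fencing" hangs, the node never finishes joining the fence domain and everything stacked on top of it (clvmd, rgmanager) blocks, which matches the frozen boot shown above.

    # typical manual start order on a RHEL 5 / CentOS 5 cluster node
    service cman start       # ccsd, cman/openais, groupd, fenced (fence domain join)
    service clvmd start      # clustered LVM locking (matches locking_type = 3 in lvm.conf)
    service rgmanager start  # resource group manager, started last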