Riaan van Niekerk wrote:
hi
We are trying to capture diskdumps when a lock_dlm kernel panic happens
and need to increase either post_fail_delay or deadnode_timeout to
prevent the dumping node from being fenced.
Are there any advantages or disadvantages to using either? Which is
recommended?
post_fail_delay and diskdump have come up previously, with some good
answers from David:
http://www.redhat.com/archives/linux-cluster/2006-June/msg00037.html
note: for capturing a "sysrq t", we manually increase deadnode_timeout,
and decrease it back again, but don't have this luxury with a kernel
panic (which can happen at any time).
Riaan
Having spent some time researching this, and with some help from Red Hat
Support, here is an attempt at an answer. I use power fencing, so some of
these points might not apply to I/O fencing:
post_fail_delay
Pros:
- single place to change it (cluster.conf) makes it global across the
cluster
- If a failed node is detected, resources will be relocated immediately
(instead of waiting for deadnode_timeout to be reached before
relocating)
- use case: post-kernel panic, when you need to capture a disk-/netdump
Cons:
- Fence daemon needs to be restarted for the change to apply (i.e. in all
likelihood you need to reboot all nodes)
- Slight annoyance: depending on how long you set the post_fail_delay, a
node may be restarting already, and is then fenced, requiring another
restart.
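For reference, a sketch of where the setting lives. post_fail_delay is an attribute of the fence_daemon element in cluster.conf; the value below (in seconds) and the post_join_delay shown next to it are purely illustrative assumptions, so keep whatever your file already has for the other attributes and pick a delay longer than your dump takes. Remember to increment config_version when editing the file.

```xml
<!-- /etc/cluster/cluster.conf (fragment) -->
<!-- 600 is an illustrative value, not a recommendation -->
<fence_daemon post_fail_delay="600" post_join_delay="3"/>
```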
deadnode_timeout
Pros:
- can be set dynamically
- useful if you have warning that the problem will materialize (we have
a scenario like that)
- use case: when you need to run "sysrq t" or some other intrusive command
which would otherwise cause a node to be fenced: increase, run sysrq,
decrease
Cons:
- needs to be set on all nodes
- Not persistent. You need to hack the cman init script to make it
persistent.
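The increase / sysrq / decrease sequence can be sketched as a small shell helper. This is only a sketch under assumptions: the /proc path below is where I understand the cman kernel module of this era to expose the tunable, so verify it on your nodes; the function name with_raised_timeout and the value 300 are made up for illustration.

```shell
#!/bin/sh
# Assumed location of the tunable; check that it exists on your nodes.
# Can be overridden via the environment.
DEADNODE_FILE="${DEADNODE_FILE:-/proc/cluster/config/cman/deadnode_timeout}"

# Hypothetical helper: raise deadnode_timeout, run a command, then
# restore the previous value regardless of how the command exits.
with_raised_timeout() {
    raised="$1"; shift
    saved=$(cat "$DEADNODE_FILE") || return 1
    echo "$raised" > "$DEADNODE_FILE" || return 1
    "$@"
    status=$?
    echo "$saved" > "$DEADNODE_FILE"
    return $status
}

# Example (as root):
#   with_raised_timeout 300 sh -c 'echo t > /proc/sysrq-trigger'
```

Since the value has to be raised on every node, the "increase" step would need to be done cluster-wide before running the intrusive command on one of them, and the old value restored everywhere afterwards.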
corrections/additions welcome