On Mon, Apr 14, 2008 at 10:53:23AM -0400, Vivek Goyal wrote:
> On Mon, Apr 14, 2008 at 10:42:28AM -0400, Neil Horman wrote:
> > On Mon, Apr 14, 2008 at 09:46:22AM -0400, Vivek Goyal wrote:
> > > On Fri, Apr 11, 2008 at 09:07:51PM -0700, Andrew Morton wrote:
> > >
> > > [..]
> > > > > Kernel panic - not syncing: Panic by panic_module.
> > > > > __tunable_atomic_notifier_call_chain enter
> > > > > msg_handler:panic_event was called.
> > > > > ipmi_wdog:wdog_panic_handler was called.
> > > > > notifier_test: notifier_test_panic() is called.
> > > > > notifier_test: notifier_test_panic2() is called.
> > > >
> > > > OK. But I don't see anywhere in here the most important piece of
> > > > information: why do we need this feature in Linux?
> > > >
> > > > What are the use-cases? What is the value? etc.
> > > >
> > > > Often I can guess (but I like the originator to remove the guesswork). In
> > > > this case I'm stumped - I can't see any reason why anyone would want this.
> > > >
> > >
> > > Hi Andrew,
> > >
> > > To begin with, he wants kdb, kgdb etc to co-exist with kdump. He wants
> > > to put all the RAS tools (who are interested in panic event) on a list
> > > and export it to user space and let user decide in what order do the tool get
> > > executed at panic time (based on priority).
> > >
> > > This brings in little bit reliability concerns for kdump due to notifier
> > > code being run after panic.
> > >
> > > I think people want to use this infrastructure beyond RAS tools. I
> > > remember somebody wanting to send a message to remote node after a
> > > panic (before kdump kicks in) so that remote node can initiate failover
> > > etc.
> > >
> > I know it doesn't particularly relate to this patch, but FWIW, for cases like
> > failover, I've inserted infrastructure in the userspace part of kdump for
> > Fedora/RHEL to support this sort of thing. We can run arbitrary scripts right
> > before and after a capture so that notifications can be sent to remote nodes in
> > a much safer fashion than using the notifier chain after a panic.
> > Neil
> >
>
> That's great. I did not know about these. So user can write custom
> scripts/binaries which can be packed into kdump initrd and executed either
> before or after dump capture? Any idea, if somebody has started using it
> already?
>
That's exactly right. I'm not sure if there is any serious use as of yet, but
I've had some inquiries about it. Specific cases that I recall include:

1) A set of users in Japan that are using the pre-dump script to block execution
until a SCSI controller detects all its drives (it apparently takes up to three
minutes to scan its bus)

2) I think some people using clustering services were using the pre-script to
notify cluster peers of the failure to avoid power fencing while a node
completed the crash dump

3) A national lab had an interest in using the pre-script to send an email to an
administrative address to log the failure in a cluster

Neil

> If that's the case then only other serious user at this point of time
> is kernel debugger (kdb, kgdb), which needs to run before kdump, in case
> of panic. And Eric suggested for those cases debugger can just insert a
> break point at panic(), instead of introducing the tunable notifier list
> infrastructure.
>
> Thanks
> Vivek

-- 
/***************************************************
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 *nhorman at redhat.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/