Hi Corey,

I was talking about a watchdog, not a kernel panic (SysRq...). On common
(x86) hardware, most server vendors implement embedded hardware chips that
could be used for this.

Indeed, SCSI-3 reservation/registration could be combined with all of this
to be sure about the nodes' sanity.

I think the admin should be given the choice of whether or not to adopt the
paranoid approach of not failing over the services.

2010/3/4 Corey Kovacs <corey.kovacs@xxxxxxxxx>:
> Brem,
>
> It's been my understanding that the kernel panic technique you are
> describing is essentially undesirable because the kernel is in an
> unknown state. Basically anything can happen. The OS doesn't have to do
> a sync for an HBA to flush, etc. Since Red Hat isn't in the business of
> building their own hardware like HP (DEC), Sun, and IBM, they take the
> next best route to ensure that nothing from that problematic machine can
> affect the storage, and the only way to guarantee that is to remove
> power from the whole machine.
>
> VMS and Tru64 use the panic method, but the other nodes will issue a
> reservation on the SCSI bus against that node to protect the storage.
> They can do that because they know exactly how their hardware and
> implementation of reservations work.
>
> Corey
>
> On Thu, Mar 4, 2010 at 5:32 AM, שלום קלמר <sklemer@xxxxxxxxx> wrote:
>>
>> Thanks to all!!!!
>>
>> Shalom.klemer@xxxxxx
>>
>> On Thu, Mar 4, 2010 at 12:00 AM, Lon Hohberger <lhh@xxxxxxxxxx> wrote:
>>>
>>> On Wed, 2010-03-03 at 13:10 +0200, שלום קלמר wrote:
>>> > Hi.
>>> >
>>> > I have 2 power supplies. But if someone pulls the power cables
>>> > by mistake, does that mean
>>> >
>>> > that the services will not fail over?
>>>
>>> The problem is:
>>>
>>>   no power = no ping + no DRAC access
>>>   no network = no ping, no DRAC access
>>>
>>> If there's no power, then it is safe to fail over.
>>>
>>> If there is no network (and power is OK), then it is not safe to fail
>>> over. Failover in this case is very likely to produce data corruption!
>>>
>>> Because we cannot tell which case happened, we do not fail over.
>>>
>>> -- Lon
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster@xxxxxxxxxx
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
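
For readers following the thread, Lon's argument quoted above can be written
down as a small decision table. This is only an illustrative sketch, not code
from any cluster software: the names Cause, OBSERVED, SAFE_TO_FAIL_OVER,
possible_causes and may_fail_over are all made up for this example, and the
table assumes (as in the quoted message) that ping and DRAC access are the
only two things a surviving node can check.

```python
from enum import Enum


class Cause(Enum):
    """Hypothetical failure causes taken from the quoted message."""
    POWER_LOSS = "power loss"
    NETWORK_LOSS = "network loss"


# What a surviving node observes for each cause: (ping answers, DRAC reachable).
# Both checks travel over the network, so both causes look identical.
OBSERVED = {
    Cause.POWER_LOSS: (False, False),
    Cause.NETWORK_LOSS: (False, False),
}

# Whether taking over the services is safe for each cause.
SAFE_TO_FAIL_OVER = {
    Cause.POWER_LOSS: True,     # the node is really off
    Cause.NETWORK_LOSS: False,  # the node may still be writing to storage
}


def possible_causes(ping_ok, drac_ok):
    """Return every cause consistent with what we can observe."""
    return [c for c, obs in OBSERVED.items() if obs == (ping_ok, drac_ok)]


def may_fail_over(ping_ok, drac_ok):
    """Fail over only if every cause that fits the observation is safe."""
    causes = possible_causes(ping_ok, drac_ok)
    return bool(causes) and all(SAFE_TO_FAIL_OVER[c] for c in causes)


if __name__ == "__main__":
    print(possible_causes(False, False))
    print(may_fail_over(False, False))
```

With no ping and no DRAC access, both causes remain possible and one of them
is unsafe, so the function refuses to fail over. An extra signal that does not
depend on the network (a watchdog or a SCSI-3 reservation, as discussed above)
would amount to adding a column to this table that tells the two rows apart.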