Hi all,

Is this a bug? Should we report it on the official RHN (I hate that slow, buggy, Oracle-based portal!)?

Summary: We have a 2-node cluster on HP ProLiant DL 380 G5 servers. There are 3 services in the cluster:
- FreeRADIUS + IP addr
- Apache + IP addr + storage LUN
- Postgres + IP addr + storage LUN

Fencing is done via HP ILO cards.

A couple of days ago, both power supplies on one node died within a short time (well, obviously it can happen). The fence daemon (fenced), ccsd, and the cluster in general didn't react well to that, despite having survived our non-real-life acceptance tests, in which we pulled both power supplies out.

To the HP ILO card, a faulty power supply is something different from a missing power supply. The ILO card continued to work on its internal battery, but the "POWER ON" action did not succeed (the POWER command kept returning that power is off).

This situation confused the fence_ilo agent. The agent saw that the other server was down, but it never returned success to the cluster because it FAILED TO POWER ON the other server.

I think this is buggy behaviour. Who cares if the fence agent cannot power the fenced node on again? Why didn't it just give up?

Here is the relevant part of the log on the healthy node, which tried to fence the other node:

Dec 10 03:37:14 aoc01 kernel: CMAN: removing node aoc02 from the cluster : Missed too many heartbeats
Dec 10 03:37:14 aoc01 fenced[3012]: aoc02 not a cluster member after 0 sec post_fail_delay
Dec 10 03:37:14 aoc01 fenced[3012]: fencing node "aoc02"
Dec 10 03:37:50 aoc01 fenced[3012]: agent "fence_ilo" reports: failed to turn on
Dec 10 03:37:50 aoc01 fenced[3012]: fence "aoc02" failed
Dec 10 03:37:55 aoc01 fenced[3012]: fencing node "aoc02"
Dec 10 03:37:55 aoc01 ccsd[2896]: process_get: Invalid connection descriptor received.
Dec 10 03:37:55 aoc01 ccsd[2896]: Error while processing get: Invalid request descriptor
Dec 10 03:37:55 aoc01 fenced[3012]: fence "aoc02" failed
Dec 10 03:38:00 aoc01 fenced[3012]: fencing node "aoc02"
Dec 10 03:38:00 aoc01 ccsd[2896]: process_get: Invalid connection descriptor received.
Dec 10 03:38:00 aoc01 ccsd[2896]: Error while processing get: Invalid request descriptor
Dec 10 05:42:13 aoc01 fenced[3012]: fence "aoc02" failed
Dec 10 05:42:18 aoc01 fenced[3012]: fencing node "aoc02"
Dec 10 05:42:18 aoc01 ccsd[2896]: process_get: Invalid connection descriptor received.
Dec 10 05:42:18 aoc01 ccsd[2896]: Error while processing get: Invalid request descriptor
Dec 10 05:42:18 aoc01 fenced[3012]: fence "aoc02" failed
Dec 10 05:42:23 aoc01 fenced[3012]: fencing node "aoc02"
Dec 10 05:42:23 aoc01 ccsd[2896]: process_get: Invalid connection descriptor received.
Dec 10 05:42:23 aoc01 ccsd[2896]: Error while processing get: Invalid request descriptor
Dec 10 05:42:23 aoc01 fenced[3012]: fence "aoc02" failed
Dec 10 05:42:28 aoc01 fenced[3012]: fencing node "aoc02"
Dec 10 05:42:28 aoc01 ccsd[2896]: process_get: Invalid connection descriptor received.
Dec 10 05:42:28 aoc01 ccsd[2896]: Error while processing get: Invalid request descriptor

--
Miroslav

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
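
P.S. To make the expected behaviour concrete, here is a rough Python sketch of the logic argued for above: fencing counts as successful once the node is confirmed OFF, and the following power-on is only best effort. This is not the actual fence_ilo code. DeadPsuIlo and fence_reboot are made-up names for illustration, and the fake ILO only simulates a server whose power supplies are dead.

```python
class DeadPsuIlo:
    """Hypothetical stand-in for an ILO card whose server has no working PSU."""

    def __init__(self):
        self.state = "on"
        self.power_on_attempts = 0

    def status(self):
        return self.state

    def power_off(self):
        self.state = "off"

    def power_on(self):
        # With dead power supplies the command is accepted,
        # but the server stays off.
        self.power_on_attempts += 1


def fence_reboot(ilo):
    """Return True when the node is confirmed fenced (powered off)."""
    ilo.power_off()
    if ilo.status() != "off":
        # Off state could not be confirmed: this is a real fencing failure.
        return False

    # Best effort only: a failed power-on does not undo the fencing.
    ilo.power_on()
    if ilo.status() != "on":
        print("warning: node is off but could not be powered back on")
    return True


if __name__ == "__main__":
    print("fenced:", fence_reboot(DeadPsuIlo()))
```

With logic like this, the surviving node would have been told that fencing succeeded and could have taken over the services, while the failed power-on would only show up as a warning.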