On Mon, Aug 28, 2006 at 03:18:24PM -0400, Zelikov_Mikhail@xxxxxxx wrote:
> While the node is down (bof226) I do fence_ack_manual -n bof226. I start
> getting the following messages in the /var/log/messages:
>
> Aug 28 15:08:30 bof227 fence_manual: Node bof226 needs to be reset
> before recovery can procede. Waiting for bof226 to rejoin the cluster
> or for manual acknowledgement that it has been reset (i.e.
> fence_ack_manual -n bof226)
> Aug 28 15:10:33 bof227 ccsd[2433]: process_get: Invalid connection
> descriptor received.
> Aug 28 15:10:33 bof227 ccsd[2433]: Error while processing get: Invalid
> request descriptor
> Aug 28 15:10:33 bof227 fenced[2497]: fence "bof226" failed

Strange bug you've found, I've not seen those ccsd errors before.
fence_manual doesn't use ccs, so I'm not sure how that's getting involved.

> >>> Is there a special reason you're using both gnbd and manual fencing?
> >>> I've never seen that done before and can't think of a reason you'd want
> >>> to.
>
> I was under impression that if there is no hw fencing device then the
> manual one is required. It was also my understanding that if I use gnbd
> devices then an explicit gnbd fencing is required as well.

If you're using gnbd you have three separate options for fencing:
- fence_gnbd, or
- hardware fencing, or
- fence_manual

All three of them are independent of each other, none need to be combined,
just pick one. We only recommend fence_manual when experimenting.
fence_gnbd is a perfectly legitimate alternative to hw fencing.

Dave

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster