Re: Tripp Lite switched PDU fence agent; exists?

Digimer <linux@xxxxxxxxxxx> · Wed, 16 Mar 2011 15:57:45 -0400

On 03/16/2011 02:59 PM, bergman@xxxxxxxxxxxx wrote:
> The pithy ruminations from Digimer <linux@xxxxxxxxxxx> on " Tripp Lite switched PDU fence agent; exists?" were:
> 
> => Hi all,
> => 
> =>   Does anyone know if the tripp lite (mn: PDUMH15ATNET, specifically)
> => has an existing RHCS fence agent? Specifically for cluster 2 / EL5.5. If
> 
> Yes.
> 
> 
> => not, has anyone written one? Failing all that, I suppose I will write
> => one. :)
> => 
> 
> Yes.
> 
> I wrote an agent for that piece of hardware and offered the agent to the RHCS community in Nov 2008...there was no response at the time.[1]
> 
> In March 2009, I sent a copy of the agent script to Jan Friesse <jfriesse@xxxxxxxxxx> and Marek Grac <mgrac@xxxxxxxxxx>, who were identified as the maintainers of all the fence agents.
> 
> Since it apparently hasn't made it into the RHCS distribution, let me know if you want a copy.
> 
> 
> Finally, I'd like to warn people away from using the TrippLite PDU model 
> PDUMH15ATNET as a fencing device. While it seems to have nice features, it has 
> a design choice that is a serious problem with fencing--when a command is 
> given to power down an outlet, there is a "random" delay (observed to be 
> about 17 to 35 seconds) before that command is executed. This has been 
> acknowledged by TrippLite support as a design choice, with no option or setting 
> to override this behavior.
>  
> Mark
> 
> 
> 	[1] http://www.redhat.com/archives/linux-cluster/2008-November/msg00215.html

Hi Mark,

  I came across your post in the archives, actually. :)

  I would like a copy of your agent, if you don't mind. I already
maintain another fence agent, and would be happy to maintain this one,
unless someone more experienced steps up.

  As for the delay, that sounds annoying, but not insurmountable. I've
got one of the switches on order already, as I wanted to see how they
work. I can fairly easily put in a 5-second poll that checks the outlet
state until the node is cut or a timeout is hit. From the cluster's
point of view, this is safe; the only cost is a delayed recovery. In my
case though, I'll be sure to use it as the secondary fence device. I'll
include such a warning/suggestion in the agent's man page as well.
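
  To make the polling idea concrete, here is a rough sketch of what I
have in mind. It is not taken from Mark's agent, and every name in it is
hypothetical: get_outlet_state stands in for whatever query the agent
ends up using to read the outlet status from the PDU, and the 60-second
timeout is just a guess that leaves headroom over the 17 to 35 second
delay Mark observed.

```python
import time
from typing import Callable


def wait_for_outlet_off(
    get_outlet_state: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the outlet reports "off" or the timeout expires.

    get_outlet_state is a hypothetical callback that asks the PDU for the
    outlet status and returns "on" or "off". The sleep and clock
    parameters exist only so the loop can be exercised without waiting.
    Returns True only when the outlet has been confirmed off.
    """
    deadline = clock() + timeout
    while True:
        # Check first, so an outlet that is already off returns at once.
        if get_outlet_state() == "off":
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # Never confirmed off: the caller must treat this as a failed fence.
            return False
        # Do not sleep past the deadline on the final iteration.
        sleep(min(poll_interval, remaining))
```

  The agent would report success to the cluster only when this returns
True. On a timeout it should fail, so the cluster never assumes a node
is dead while its outlet may still be live.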

Cheers

-- 
Digimer
E-Mail: digimer@xxxxxxxxxxx
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster