Problem statement:

Fencing fails if the WTI device loses network connectivity. It sure
would be Nice(tm) to have a backup method to access the fence device.

Solution:

With this patch, you can use your WTI RSM serial port server as a backup
method to access your WTI IPS/NPS/NBB/TPS/... remote power controller in
the event that the IPS/*/etc. loses its network connectivity (I'll just
say "IPS" from now on to mean "all of 'em").

IMPORTANT: If you want it to work with your particular serial port
server (that is not a WTI RSM), I welcome your patch or your hardware. ;)

This patch was tested with an IPS-800 as the back-end and an RSM-8 as
the front-end. It *should* work with any RSM and any supported WTI
remote power switch (e.g. NPS, IPS, NBB, TPS series).

Configuration notes:

* You must enable "Direct Connect" access for the port connected to the
  IPS (/P [number], option 31). It should be set to "On - Password".

* The perl Net::Telnet module does not work with the RSM's standard
  telnet port; I don't know why (nor do I care to figure it out :) ).
  Because of this, you must enable "Raw Socket Access" (/N, option 31);
  using the standard direct access telnet port will not work.

* You may use a script to retrieve the password (use the
  rsm_passwd_script option instead of rsm_passwd).

* You must add a new fence device (manually) to cluster.conf (example
  below). There is no UI support for this.

...
    <clusternode name="green" nodeid="2" votes="1">
      <multicast addr="225.0.0.12" interface="eth0"/>
      <fence>
        <method name="1">
          <device name="ips-rack9" port="2"/>
        </method>
        <method name="2">
          <device name="ips-rack9-backup" port="2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_wti" name="ips-rack9" passwd="wti"
                 ipaddr="ips-rack9"/>
    <fencedevice agent="fence_wti" name="ips-rack9-backup" passwd="wti"
                 ipaddr="rsm-rack9" rsm_enable="1" rsm_login="super"
                 rsm_passwd="super" tcpport="3108"/>
  </fencedevices>
...

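A note on rsm_passwd_script: the patch runs the configured command with
backticks, strips the trailing newline, and uses whatever was printed on
stdout as the RSM password. A minimal sketch of such a helper might look
like the following. The function name and the password file path are
made up for illustration; nothing like this ships with the patch.

```shell
#!/bin/sh
# Hypothetical helper for rsm_passwd_script (not part of the patch).
# Prints the first line of a password file on stdout.
print_rsm_passwd() {
    # The default path is an assumption; pass another file as $1.
    passwd_file="${1:-/etc/cluster/rsm-rack9.passwd}"
    if [ ! -r "$passwd_file" ]; then
        echo "cannot read $passwd_file" >&2
        return 1
    fi
    head -n 1 "$passwd_file"
}

# A script installed for the agent would end by calling:
#   print_rsm_passwd
```

Installed as an executable script, it would be named in
rsm_passwd_script="..." in place of rsm_passwd="super" on the backup
fencedevice line.
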
In my case (the example configuration above), I plugged the IPS into
port 8 on the RSM - so the raw port becomes '3108'. Talking to the raw
port directly reduces the complexity of the patch (when compared to
parsing RSM output).

Note that the passwd, agent, and port (plug number) stay the same, but
the host we're talking to changes, and a bunch of additional stuff is
added to talk to the RSM prior to even getting to the IPS.

How to test:

* Apply the patch to CVS/head; rebuild / install fence_wti.
* Configure your cluster similar to the above (don't forget to run
  ccs_tool update).
* Pull the network jack from your WTI power switch.
* Run fence_node <nodename>. The first fence level will fail, but the
  second one should succeed.

Other information:

* This patch has been tested with the same RSM with direct access for
  the port set to 'no password' as well. In this case, the relevant
  extra fence device would more or less look like the following:

  ...
    <fencedevice agent="fence_wti" name="ips-rack9-backup" passwd="wti"
                 ipaddr="rsm-rack9" serial="1" tcpport="3108"/>
  ...

  I do not recommend this configuration because it is possible to log
  out of the RSM without being logged out of the IPS. The effect is
  that someone might be able to reconnect to the existing IPS session
  without being prompted for any password (that's bad!).

  The upshot is that raw serial mode *MIGHT* work with other serial
  port server appliances which support raw / unpassworded / direct
  telnet access.

FAQ:

Q: What units was this patch developed on?
A: WTI RSM-8 & WTI IPS-800

Q: What versions of Linux-cluster will this work with?
A: Currently, just CVS - this is a new patch.

Q: The RSM supports SSH - why is there no SSH support?
A: Because it would increase the complexity of the fence_wti agent
   significantly. Patches accepted.

Q: Why not just have two hosts (rsm_host, for example) and do the
   retrying internally?
A: fenced already retries and already has provisions for doing this;
   implementing it within fence agents is redundant (not to mention, it
   adds needless complexity).

Q: Why did you choose the WTI RSM?
A: First, because people like solutions from as few vendors as
   possible - so, the RSM + IPS was a natural fit in that regard.
   Second, because it's what I have on-hand.

Q: Why don't you support the [insert your favorite serial server here]?
A: Because I don't have one and because it adds even more complexity to
   the agent. Taking patches (should amount to changing the
   login/password expect strings in the rsm-login section...).

-- Lon

Index: fence_wti.pl
===================================================================
RCS file: /cvs/cluster/cluster/fence/agents/wti/fence_wti.pl,v
retrieving revision 1.7
diff -u -r1.7 fence_wti.pl
--- fence_wti.pl        12 Feb 2007 20:17:08 -0000      1.7
+++ fence_wti.pl        15 Mar 2007 18:49:44 -0000
@@ -47,6 +47,14 @@
     print "  -o <operation>   Operation to perform (on, off, reboot)\n";
     print "  -q               quiet mode\n";
     print "  -T               test reports state of plug (no power cycle)\n";
+    print "  -s               Use raw serial pass-through (e.g. connect via\n";
+    print "                   a serial port server)\n";
+    print "  -R               Use WTI RSM pass-through (requires a separate\n";
+    print "                   login and password)\n";
+    print "  -L               RSM login\n";
+    print "  -X               RSM password\n";
+    print "  -Z <path>        Script to run to retrieve RSM login password\n";
+    print "  -P               TCP port (if not 23)\n";
     print "  -V               Version\n";

     exit 0;
@@ -76,9 +84,19 @@
     exit 0;
 }

+sub do_exit
+{
+    if (defined $opt_s || defined $opt_R) {
+        $t->put("/x,y\n");
+    }
+    $t->close;
+    exit $_[0];
+}
+
 $opt_o = "reboot";
+$opt_P = "23";
 if (@ARGV > 0) {
-    getopts("a:hn:p:S:qTVo:") || fail_usage ;
+    getopts("a:hn:p:S:qTVo:P:RsL:X:Z:") || fail_usage ;

     usage if defined $opt_h;
     version if defined $opt_V;
@@ -112,14 +130,42 @@
         }
     }

+    if (defined $opt_R) {
+        $pwd_script_out = `$opt_Z`;
+        chomp($pwd_script_out);
+        if ($pwd_script_out) {
+            $opt_X = $pwd_script_out;
+        }
+        fail "failed: no RSM password" unless defined $opt_X;
+    }
+
     fail "failed: no password" unless defined $opt_p;
 }

 $t = new Net::Telnet;
+#$t->input_log("input_log");
+#$t->output_log("output_log");
+#$t->dump_log("dump_log");
+$t->open(Host=>$opt_a, Port=>$opt_P);
+
+if (defined $opt_R) {
+    # RSM-8 serial terminal server intelligence
+    fail "failed: no RSM login" unless defined $opt_L;
+    fail "failed: no RSM password " unless defined $opt_X;
+    ($line, $match) = $t->waitfor("/login:/");
+    $t->print($opt_L);
+    ($line, $match) = $t->waitfor("/Password:/");
+    $t->print($opt_X);
+    ($line, $match) = $t->waitfor("/Connected/");
+}

-$t->open($opt_a);
+if (defined $opt_s || defined $opt_R) {
+    # print "Raw Serial: Send init <cr>\n";
+    $t->put("\n");
+}

-$expr = '/:|\n/';
+
+$expr = '/:|\n|\>/';

 while (1)
 {
@@ -131,7 +177,6 @@
         $t->print($opt_p);
         $expr = '/\n/';
     }
-
     elsif ($line =~ /v\d.\d+/)
     {
         $line =~ /\D*(\d)\.(\d+).*/;
@@ -141,13 +186,18 @@
         $t->waitfor('/(TPS|IPS|RPC|NPS|NBB)\>/');
         last;
     }
+    elsif ($line =~ /(TPS|IPS|RPC|NPS|NBB)\>/)
+    {
+        # Already got one!
+        last;
+    }
 }


 if (defined $opt_T)
 {
     &test($t);
-    exit 0;
+    do_exit 0;
 }


@@ -173,7 +223,7 @@
         {
             print "failed: plug number \"$opt_n\" not found\n"
                 unless defined $opt_q;
-            exit 1;
+            do_exit 1;
         }

         $line =~ /^\s+(\d+).*/;
@@ -196,7 +246,7 @@
                 print "failed: plug not off ($state)\n"
                     unless defined $opt_q;

-                exit 1;
+                do_exit 1;
             }
         }
     }
@@ -225,7 +275,7 @@
         {
             print "success: plug-on warning\n"
                 unless defined $opt_q;
-            exit 0;
+            do_exit 0;
         }

         $line =~ /^\s+(\d+).*/;
@@ -249,14 +299,14 @@
                 print "success: plug state warning ($state)\n"
                     unless defined $opt_q;

-                exit 0;
+                do_exit 0;
             }
         }
     }

 print "success: $opt_o operation on plug $opt_n\n"
     unless defined $opt_q;
-exit 0;
+do_exit 0;



@@ -274,7 +324,7 @@
     {
         print "failed: plug number \"$opt_n\" not found\n"
             unless defined $opt_q;
-        exit 1;
+        do_exit 1;
     }

     $line =~ /^\s+(\d+).*/;
@@ -349,6 +399,14 @@
         {
             $opt_a = $val;
         }
+        elsif ($name eq "tcpport" )
+        {
+            $opt_P = $val;
+        }
+        elsif ($name eq "serial" )
+        {
+            $opt_s = $val;
+        }
         # FIXME -- depreicated residue of old fencing system
         elsif ($name eq "name" ) { }

@@ -357,10 +415,10 @@
         {
             $opt_p = $val;
         }
-       elsif ($name eq "passwd_script" )
-       {
-               $opt_S = $val;
-       }
+        elsif ($name eq "passwd_script" )
+        {
+            $opt_S = $val;
+        }
         elsif ($name eq "port" )
         {
             $opt_n = $val;
@@ -369,6 +427,23 @@
         {
             $opt_o = $val;
         }
+        elsif ($name eq "rsm_enable" )
+        {
+            $opt_R = $val;
+        }
+        elsif ($name eq "rsm_login" )
+        {
+            $opt_L = $val;
+        }
+        elsif ($name eq "rsm_passwd" )
+        {
+            $opt_X = $val;
+        }
+        elsif ($name eq "rsm_passwd_script" )
+        {
+            $opt_Z = $val;
+        }
+
         # elsif ($name eq "test" )
         # {
         #     $opt_T = $val;
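
Before pulling the network jack, it can be handy to poke the backup path
by hand with the new command-line options from the patch above. The
sketch below only builds and prints a status-only (-T, no power cycle)
command line rather than running it. It assumes the RSM maps serial port
N to raw TCP port 3100 + N, which is a guess generalized from my single
data point (port 8 -> 3108); check your RSM's settings. The helper
function names are made up for illustration, and the login/password
values are the ones from the example cluster.conf.

```shell
#!/bin/sh
# Assumption: RSM serial port N is reachable on raw TCP port 3100 + N
# (inferred from port 8 -> 3108; verify against your RSM configuration).
rsm_raw_port() {
    echo $((3100 + $1))
}

# Build (but do not run) a fence_wti status check through the RSM.
# Arguments: RSM host, RSM serial port number, IPS plug number.
rsm_test_cmd() {
    rsm_host=$1
    rsm_port=$2
    plug=$3
    echo "fence_wti -a $rsm_host -P $(rsm_raw_port "$rsm_port")" \
         "-R -L super -X super -p wti -n $plug -T"
}

rsm_test_cmd rsm-rack9 8 2
# prints: fence_wti -a rsm-rack9 -P 3108 -R -L super -X super -p wti -n 2 -T
```

Dropping -R, -L and -X and adding -s gives the raw serial ('no
password') variant instead.
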

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster