tcm_node difficulties when used with pacemaker cluster, and suggested patches.

Good day folks,

   First off, please let me thank you for the fine work that you've done with the Linux-IO Target project.  We have used it as the foundation for our central SAN storage at Royal Roads University, supporting our VMware ESX virtual infrastructure.  We make use of LIO for its iSCSI target, in combination with DRBD, LVM and Pacemaker, to create the redundant, replicated iSCSI stack described in http://www.linbit.com/en/downloads/tech-guides?download=9:highly-available-iscsi-with-drbd-and-pacemaker (I'm sure you're well familiar with it).

  We've gone well beyond the basic setup described in that document, and in running LIO iSCSI in the cluster this way, we've encountered a few issues along the way.  A couple of them seemed worth sending back to you in the form of suggested patches.

   Both patches relate to issues caused when migrating iSCSI resources in the cluster.  I'm fairly sure that without a cluster manager controlling the iSCSI targets/LUNs, these things would go unnoticed.  Within the cluster, however, the issues make migrating resources from one host to another significantly more difficult.

   The first issue concerns starting multiple iSCSI targets/LUNs simultaneously.  In our configuration, we have up to four targets on a host (each with its own distinct IP address and IQN).  Each target presents only a single LUN.  With multiple targets/LUNs running in the same two-node cluster, a node failure forces the simultaneous startup of multiple targets/LUNs on the surviving node.  Because the cluster manager simply executes multiple instances of 'tcm_node' to accomplish this, race conditions between those concurrent processes become possible.  We're seeing one of these races very consistently; it manifests in the following log entry:

2014-11-05T15:15:37.509489-08:00 capacity-3 iSCSILogicalUnit(lun0_vmCapacity-f)[7421]: [7717]: ERROR: Traceback (most recent call last):
  File "/usr/sbin/tcm_node", line 754, in <module>
    main()
  File "/usr/sbin/tcm_node", line 746, in main
    (options, args) = parser.parse_args()
  File "/usr/lib64/python2.7/optparse.py", line 1399, in parse_args
    stop = self._process_args(largs, rargs, values)
  File "/usr/lib64/python2.7/optparse.py", line 1439, in _process_args
    self._process_long_opt(rargs, values)
  File "/usr/lib64/python2.7/optparse.py", line 1514, in _process_long_opt
    option.process(opt, value, values, self)
  File "/usr/lib64/python2.7/optparse.py", line 788, in process
    self.action, self.dest, opt, value, values, parser)
  File "/usr/lib64/python2.7/optparse.py", line 808, in take_action
    self.callback(self, opt, value, parser, *args, **kwargs)
  File "/usr/sbin/tcm_node", line 726, in dispatcher
    orig_callback(*value)
  File "/usr/sbin/tcm_node", line 201, in tcm_createvirtdev
    os.mkdir(hba_full_path)
OSError: [Errno 17] File exists: '/sys/kernel/config/target/core/iblock_0'

   Looking at the code, it's apparent that this is a race between the concurrent tcm_node processes: one process creates the directory between another's 'if directory doesn't exist' check (line 200) and its 'create directory' call (line 201).  To combat this, I've changed the code to use a try/except block that ignores the error if (and only if) it is 'directory already exists'.  This is contained in the first patch listed below.
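For anyone reading along, here is a minimal standalone sketch of that pattern (written in Python 3 for illustration; the actual patch below targets the Python 2 tcm_node script, and the `iblock_0` name is just borrowed from the log above):

```python
import errno
import os
import tempfile

def mkdir_race_safe(path):
    """Create a directory, tolerating a concurrent creator.

    Checking os.path.isdir() before os.mkdir() leaves a window in which
    another process can create the directory first; catching EEXIST on
    the mkdir itself closes that window.
    """
    try:
        os.mkdir(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise  # any other failure is still fatal

# Simulate the two racing tcm_node invocations: the "losing"
# second call must not raise.
base = tempfile.mkdtemp()
target = os.path.join(base, "iblock_0")
mkdir_race_safe(target)
mkdir_race_safe(target)  # directory already exists: silently tolerated
```

The check-then-create form can be kept as an optimization, but the try/except around the mkdir is what makes it safe.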

   
   The second issue concerns attempting to stop a LUN using the cluster manager (which happens any time the resource needs to migrate from an existing, healthy node).  The cluster manager again calls the tcm_node script to perform the action.  The tcm_node script DOES in fact successfully stop the LUN, but it exits with an error status and the following error in the logs:

ERROR: Traceback (most recent call last):
  File "/usr/sbin/tcm_node", line 754, in <module>
    main()
  File "/usr/sbin/tcm_node", line 746, in main
    (options, args) = parser.parse_args()
  File "/usr/lib64/python2.7/optparse.py", line 1399, in parse_args
    stop = self._process_args(largs, rargs, values)
  File "/usr/lib64/python2.7/optparse.py", line 1439, in _process_args
    self._process_long_opt(rargs, values)
  File "/usr/lib64/python2.7/optparse.py", line 1514, in _process_long_opt
    option.process(opt, value, values, self)
  File "/usr/lib64/python2.7/optparse.py", line 788, in process
    self.action, self.dest, opt, value, values, parser)
  File "/usr/lib64/python2.7/optparse.py", line 808, in take_action
    self.callback(self, opt, value, parser, *args, **kwargs)
  File "/usr/sbin/tcm_node", line 726, in dispatcher
    orig_callback(*value)
  File "/usr/sbin/tcm_node", line 333, in tcm_freevirtdev
    tcm_delete_aptpl_metadata(unit_serial)
  File "/usr/sbin/tcm_node", line 256, in tcm_delete_aptpl_metadata
    shutil.rmtree(aptpl_file)
  File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/usr/lib64/python2.7/shutil.py", line 235, in rmtree
    names = os.listdir(path)
OSError: [Errno 20] Not a directory: '/var/target/pr/aptpl_cadf532a'

   Looking at the code, I see that it's trying to use shutil.rmtree to remove the aptpl_metadata file.  The problem is that rmtree only operates on directories, not on individual files.  I believe this error was introduced in commit 8e2acc93cd5545b8bd46148957d3be2c23d878ee, where the code was updated from 'rm -rf' to 'rmtree'.  I believe it's been broken since then, but unless you use a cluster manager or watch the result codes, nobody notices, as the only side effect is a stray aptpl_metadata file left lying in the /var/target/pr/ directory.  To the cluster manager, though, this results in a 'failed' action, which prevents clean and orderly resource migrations from occurring.  As shown in the second patch below, I believe the more appropriate call is 'os.remove'.
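A standalone sketch demonstrating the difference (Python 3 here for illustration; the aptpl_cadf532a filename is just borrowed from the log above, and any real APTPL file contents are irrelevant to the point):

```python
import os
import shutil
import tempfile

# Create a stand-in for an APTPL metadata file.
base = tempfile.mkdtemp()
aptpl_file = os.path.join(base, "aptpl_cadf532a")
with open(aptpl_file, "w") as f:
    f.write("PR metadata stand-in\n")

# shutil.rmtree() refuses a plain file: it tries to list the path as a
# directory and raises the 'Not a directory' OSError seen in the logs.
try:
    shutil.rmtree(aptpl_file)
    rmtree_failed = False
except OSError:
    rmtree_failed = True

# os.remove() is the right call for deleting a single file.
os.remove(aptpl_file)
```

This matches the prior 'rm -rf' behaviour closely enough for this code path, since tcm_delete_aptpl_metadata already checks os.path.isfile() before deleting.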


   I mention these things respectfully, recognizing that tcm_node is part of lio-utils, which is listed as 'deprecated' in favour of targetcli.  Unfortunately, until someone updates the OCF resource agents 'Heartbeat::iSCSITarget' and 'Heartbeat::iSCSILogicalUnit', people will keep following published guides and running into these problems with their iSCSI clusters.

Cheers, and again, thank you for the work you've done,

.Steve.


Stephen Beaudry, Manager
Server, Network and Telecom Infrastructures | Royal Roads University
T 250.391.2600 ext. 4149 
2005 Sooke Road, Victoria, BC  Canada  V9B 5Y2 | royalroads.ca
 
LIFE.CHANGING



________________________________________

--- tcm_node_orig       2014-11-05 19:24:34.973316251 -0800
+++ tcm_node    2014-11-06 01:32:24.677089841 -0800
@@ -198,7 +198,14 @@ def tcm_createvirtdev(dev_path, plugin_p
        # create hba if it doesn't exist
        hba_full_path = tcm_full_path(hba_path)
        if not os.path.isdir(hba_full_path):
-               os.mkdir(hba_full_path)
+                try:
+                        os.mkdir(hba_full_path)
+                except OSError, e:
+                        if e.errno == 17:
+                                # Suppress the file/dir exists error
+                                pass
+                        else:
+                                raise

        # create dev if it doesn't exist
        full_path = tcm_full_path(dev_path)

________________________________________



--- tcm_node_orig       2014-11-05 19:24:34.973316251 -0800
+++ tcm_node    2014-11-06 01:32:24.677089841 -0800
@@ -253,7 +260,7 @@ def tcm_delete_aptpl_metadata(unit_seria
        if not os.path.isfile(aptpl_file):
                return

-       shutil.rmtree(aptpl_file)
+       os.remove(aptpl_file)

def tcm_process_aptpl_metadata(dev_path):
        tcm_check_dev_exists(dev_path)

--



