Fujita-san et al,

I am running into a situation where somewhat unexpected tgtd behavior seems to break failover for a MS Cluster Service cluster on Windows 2003 with Microsoft iSCSI initiator 2.08.

Please consider the following scenario:

- 4-node MSCS cluster is configured to use a tgtd-exported LU as its quorum device.
- The active node in the cluster reserves that LU via a SCSI-2 reservation (RESERVE).
- Manual failover (including the associated SCSI-2 RELEASE) has been established to work as expected.
- The currently active node has its iSCSI connection broken by removal of the Ethernet cable.
- After the Time2Retain expires, MSCS initiates automatic failover to one of the surviving peer nodes.
- The node taking over now attempts to preempt the reservation on the quorum disk, by issuing a bus reset and then RESERVE.
- The bus reset appears to fail.
- Thus, the node taking over cannot RESERVE, and failover effectively breaks.

Looking at what I think is the relevant code path in
http://git.kernel.org/?p=linux/kernel/git/tomo/tgt.git;a=blob;f=usr/iscsi/iscsid.c#l1318,
I see this:

    1318    case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
    1319            fn = LOGICAL_UNIT_RESET;
    1320            break;
    1321    case ISCSI_TM_FUNC_TARGET_WARM_RESET:
    1322    case ISCSI_TM_FUNC_TARGET_COLD_RESET:
    1323    case ISCSI_TM_FUNC_TASK_REASSIGN:
    1324            err = ISCSI_TMF_RSP_NOT_SUPPORTED;
    1325            break;

So it appears that if the initiator does not issue an LU reset, but instead a bus reset (and MSCS seems to exhibit that behavior, at least in Windows 2003), then this preemption method is bound to fail.

Interestingly, ietd appears to handle this correctly. Again, I'm just guessing that this is the correct code path to look into, but in http://svn.berlios.de/wsvn/iscsitarget/trunk/kernel/iscsi.c, it seems that in execute_task_management(), ISCSI_FUNCTION_TARGET_WARM_RESET and ISCSI_FUNCTION_TARGET_COLD_RESET are mapped to the target_reset() function and thus handled correctly.
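To make the question concrete, here is a minimal, self-contained sketch of the dispatch behavior described above. It is not tgtd or ietd code. The function and response codes are the values defined in RFC 3720; everything else (map_tmf(), struct tmf_decision, the TMF_ACTION_* names, and the allow_target_reset flag) is hypothetical and exists only for illustration. With the flag cleared, the sketch mirrors the excerpt and answers a target reset with "not supported"; with the flag set, it shows what mapping both target resets to a single target-wide reset action might look like.

```c
#include <assert.h>

/* Task management function codes, as defined in RFC 3720. */
enum iscsi_tm_func {
    ISCSI_TM_FUNC_LOGICAL_UNIT_RESET = 5,
    ISCSI_TM_FUNC_TARGET_WARM_RESET  = 6,
    ISCSI_TM_FUNC_TARGET_COLD_RESET  = 7,
    ISCSI_TM_FUNC_TASK_REASSIGN      = 8
};

/* Task management response codes, as defined in RFC 3720. */
enum iscsi_tmf_rsp {
    ISCSI_TMF_RSP_COMPLETE      = 0x00,
    ISCSI_TMF_RSP_NOT_SUPPORTED = 0x05
};

/* Hypothetical internal actions; these names are NOT taken from tgtd. */
enum tmf_action {
    TMF_ACTION_NONE,          /* nothing is executed */
    TMF_ACTION_LU_RESET,      /* reset a single logical unit */
    TMF_ACTION_TARGET_RESET,  /* reset every LU behind the target */
    TMF_ACTION_UNMODELED      /* function not covered by this sketch */
};

struct tmf_decision {
    enum tmf_action action;
    int rsp;
};

/*
 * Decide how to answer a task management function request.
 * allow_target_reset == 0 reproduces the behavior of the quoted excerpt;
 * allow_target_reset != 0 is the hypothetical alternative.
 */
struct tmf_decision map_tmf(int func, int allow_target_reset)
{
    struct tmf_decision d = { TMF_ACTION_NONE, ISCSI_TMF_RSP_COMPLETE };

    switch (func) {
    case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
        d.action = TMF_ACTION_LU_RESET;
        break;
    case ISCSI_TM_FUNC_TARGET_WARM_RESET:
    case ISCSI_TM_FUNC_TARGET_COLD_RESET:
        if (allow_target_reset)
            d.action = TMF_ACTION_TARGET_RESET;
        else
            d.rsp = ISCSI_TMF_RSP_NOT_SUPPORTED;
        break;
    case ISCSI_TM_FUNC_TASK_REASSIGN:
        d.rsp = ISCSI_TMF_RSP_NOT_SUPPORTED;
        break;
    default:
        /* Abort task and friends are outside the scope of this sketch. */
        d.action = TMF_ACTION_UNMODELED;
        break;
    }
    return d;
}

/* Small self-check that documents the two behaviors side by side. */
void map_tmf_self_check(void)
{
    assert(map_tmf(ISCSI_TM_FUNC_TARGET_WARM_RESET, 0).rsp ==
           ISCSI_TMF_RSP_NOT_SUPPORTED);
    assert(map_tmf(ISCSI_TM_FUNC_TARGET_WARM_RESET, 1).action ==
           TMF_ACTION_TARGET_RESET);
}
```

In this sketch, map_tmf(ISCSI_TM_FUNC_TARGET_WARM_RESET, 0) yields ISCSI_TMF_RSP_NOT_SUPPORTED, which is what I believe the initiator sees today, while the same call with the flag set selects the hypothetical target-wide reset. Whether such a reset would also have to clear the SCSI-2 reservation held by the disconnected node is exactly the part I am unsure about.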
Could you please explain whether my analysis of the situation is correct, and if so: what is the reason for not supporting target resets in tgtd, and what is it that precludes supporting it?

Many thanks in advance,
Florian