Hi,

I have been using an LIO target connected to 4 ESXi hosts for testing purposes. On the ESXi hosts I have a total of about 30 virtual machines. Whenever I run a heavy IO workload in the VMs for some tests, ESXi seems to flip-flop between marking the iSCSI connection online and offline. This can be seen in the log snippets given at the bottom.

A brief note about the setup:

Ubuntu installed on a Supermicro box
8 x 3TB disks in RAID10, /dev/md0
4 x 480GB SSDs in RAID10, /dev/md1
bcache0 created out of /dev/md0 and /dev/md1 (cache)
LVMs created on top of bcache0
10 Gbps Intel 82599ES NIC
10 Gbps on all ESXi hosts for iSCSI portgroups
Switch (Dell PowerConnect), ESXi, and Linux enabled for jumbo frames

Every time I see the connection flap in ESXi, I see a lot of task aborts being sent by the LIO target. I have looked around in the mailing list archives, but I am not able to understand the root cause of the problem. If I restart the target service, things are back in order until I run heavy IO again.

I have taken care of the switch-related settings as per Dell's whitepaper on ESXi best practices. Today I increased the MaxConnections param to 2; I am not sure if it will help.

Any clues that could help me root-cause and solve this issue are welcome. Please do let me know if you need any other log files from either the Linux storage box or the ESXi hosts, and I'll pull them out for you.

Thanks,
Govind

targetcli output

o- / ......................................................................................................................... [...]
  o- backstores .............................................................................................................. [...]
  | o- fileio ................................................................................................... [0 Storage Object]
  | o- iblock .................................................................................................. [2 Storage Objects]
  | | o- iscsi-vol1 ................................................................................ [/dev/vg1/iscsi-vol1 activated]
  | | o- iscsi-vol2 ................................................................................ [/dev/vg1/iscsi-vol2 activated]
  | o- pscsi .................................................................................................... [0 Storage Object]
  | o- rd_dr .................................................................................................... [0 Storage Object]
  | o- rd_mcp ................................................................................................... [0 Storage Object]
  o- ib_srpt ........................................................................................................... [0 Targets]
  o- iscsi ............................................................................................................. [4 Targets]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.0ab22395332f .......................................................... [1 TPG]
  | | o- tpgt1 ........................................................................................................... [enabled]
  | |   o- acls ............................................................................................................ [1 ACL]
  | |   | o- iqn.1998-01.com.vmware:19-577fd24d .................................................................... [2 Mapped LUNs]
  | |   |   o- mapped_lun1 ............................................................................................. [lun1 (rw)]
  | |   |   o- mapped_lun2 ............................................................................................. [lun2 (rw)]
  | |   o- luns ........................................................................................................... [2 LUNs]
  | |   | o- lun1 ........................................................................ [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  | |   | o- lun2 ........................................................................ [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  | |   o- portals ...................................................................................................... [1 Portal]
  | |     o- 10.12.9.249:3260 .................................................................................. [OK, iser disabled]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.1e0718367802 .......................................................... [1 TPG]
  | | o- tpgt1 ........................................................................................................... [enabled]
  | |   o- acls ............................................................................................................ [1 ACL]
  | |   | o- iqn.1998-01.com.vmware:14-6c077dd9 .................................................................... [2 Mapped LUNs]
  | |   |   o- mapped_lun1 ............................................................................................. [lun1 (rw)]
  | |   |   o- mapped_lun2 ............................................................................................. [lun2 (rw)]
  | |   o- luns ........................................................................................................... [2 LUNs]
  | |   | o- lun1 ........................................................................ [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  | |   | o- lun2 ........................................................................ [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  | |   o- portals ...................................................................................................... [1 Portal]
  | |     o- 10.12.9.249:3260 .................................................................................. [OK, iser disabled]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.4d381eea71d4 .......................................................... [1 TPG]
  | | o- tpgt1 ........................................................................................................... [enabled]
  | |   o- acls ............................................................................................................ [1 ACL]
  | |   | o- iqn.1998-01.com.vmware:16 ............................................................................. [2 Mapped LUNs]
  | |   |   o- mapped_lun1 ............................................................................................. [lun1 (rw)]
  | |   |   o- mapped_lun2 ............................................................................................. [lun2 (rw)]
  | |   o- luns ........................................................................................................... [2 LUNs]
  | |   | o- lun1 ........................................................................ [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  | |   | o- lun2 ........................................................................ [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  | |   o- portals ...................................................................................................... [1 Portal]
  | |     o- 10.12.9.249:3260 .................................................................................. [OK, iser disabled]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.8428b87f25ae .......................................................... [1 TPG]
  |   o- tpgt1 ........................................................................................................... [enabled]
  |     o- acls ............................................................................................................ [1 ACL]
  |     | o- iqn.1998-01.com.vmware:17-25c02de7 .................................................................... [2 Mapped LUNs]
  |     |   o- mapped_lun1 ............................................................................................. [lun1 (rw)]
  |     |   o- mapped_lun2 ............................................................................................. [lun2 (rw)]
  |     o- luns ........................................................................................................... [2 LUNs]
  |     | o- lun1 ........................................................................ [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  |     | o- lun2 ........................................................................ [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  |     o- portals ...................................................................................................... [1 Portal]
  |       o- 10.12.9.249:3260 .................................................................................. [OK, iser disabled]
  o- loopback .......................................................................................................... [0 Targets]
  o- qla2xxx ........................................................................................................... [0 Targets]
  o- tcm_fc ............................................................................................................ [0 Targets]

Ubuntu kern.log around Mar 11 18:53 PST

Mar 11 18:53:06 storage1 kernel: [1059266.165587] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000c0
Mar 11 18:53:06 storage1 kernel: [1059266.168159] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85016
Mar 11 18:53:09 storage1 kernel: [1059269.575472] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000007
Mar 11 18:53:13 storage1 kernel: [1059273.356580] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000cb
Mar 11 18:53:13 storage1 kernel: [1059273.359017] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85980
Mar 11 18:53:19 storage1 kernel: [1059279.763943] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000ca
Mar 11 18:53:19 storage1 kernel: [1059279.766294] ABORT_TASK: Found referenced iSCSI task_tag: 85020
Mar 11 18:53:19 storage1 kernel: [1059279.766298] ABORT_TASK: ref_tag: 85020 already complete, skipping
Mar 11 18:53:19 storage1 kernel: [1059279.766300] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85020
Mar 11 18:53:23 storage1 kernel: [1059283.189916] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000011
Mar 11 18:53:23 storage1 kernel: [1059283.192360] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743929
Mar 11 18:53:23 storage1 kernel: [1059283.192366] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743928
Mar 11 18:53:26 storage1 kernel: [1059286.968498] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d5
Mar 11 18:53:33 storage1 kernel: [1059293.343955] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d3
Mar 11 18:53:33 storage1 kernel: [1059293.346318] ABORT_TASK: Found referenced iSCSI task_tag: 85024
Mar 11 18:53:33 storage1 kernel: [1059293.346321] ABORT_TASK: ref_tag: 85024 already complete, skipping
Mar 11 18:53:33 storage1 kernel: [1059293.346323] Unexpected ret: -32 send data 48

ESXi log around the same time (note that the times are in UTC)

2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba38:CH:0 T:3 CN:0: iSCSI connection is being marked "OFFLINE" (Event:4)
2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID: 00023d000001 TARGET: iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.0ab22395332f TPGT: 1 TSIH: 0]
2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 10.12.9.19:27631 R: 10.12.9.249:3260]
2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_TaskMgmtIssue: vmhba38:CH:0 T:3 L:2 : Task mgmt "Abort Task" with itt=0x14c15 (refITT=0x14c14) timed out.
2015-03-12T01:53:07.765Z cpu0:33557)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.6001405da598aed29c44cc781369ffad" state in doubt; requested fast path state update...
2015-03-12T01:53:07.880Z cpu0:33369)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.025Z cpu0:33557)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.170Z cpu0:35670)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.311Z cpu0:33557)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.440Z cpu0:32795)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.6001405da598aed29c44cc781369ffad" state in doubt; requested fast path state update...
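
In case it helps anyone line the two logs up: ESXi prints the task tags in hex (itt/refITT) while the LIO kernel messages print ref_tag in decimal, and the two logs use different clocks. Below is a minimal sketch of how the excerpts could be correlated. The function names are my own, the regexes only cover the message formats shown above, and the year (2015) and the UTC-7 offset are assumptions inferred from the seven-hour gap between the two excerpts, since kern.log carries neither.

```python
import re
from datetime import datetime, timedelta, timezone

# Patterns for the two message formats quoted in this mail only.
ESXI_ABORT = re.compile(r"itt=(0x[0-9a-fA-F]+) \(refITT=(0x[0-9a-fA-F]+)\)")
LIO_REF_TAG = re.compile(r"ref_tag: (\d+)")


def esxi_ref_itt(line):
    """Return the aborted task's tag (refITT) from an ESXi line as an int, or None."""
    m = ESXI_ABORT.search(line)
    return int(m.group(2), 16) if m else None


def lio_ref_tags(lines):
    """Collect every decimal ref_tag mentioned in LIO ABORT_TASK lines."""
    tags = set()
    for line in lines:
        m = LIO_REF_TAG.search(line)
        if m:
            tags.add(int(m.group(1)))
    return tags


def kern_to_utc(stamp, year=2015, utc_offset_hours=-7):
    """Convert a kern.log stamp such as 'Mar 11 18:53:06' to UTC.

    The year and the offset are assumptions: syslog stamps carry neither.
    Month-name parsing with %b assumes an English/C locale.
    """
    local = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    local = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


esxi_line = ('iscsivmk_TaskMgmtIssue: vmhba38:CH:0 T:3 L:2 : Task mgmt '
             '"Abort Task" with itt=0x14c15 (refITT=0x14c14) timed out.')
lio_line = "ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85016"

print(esxi_ref_itt(esxi_line))         # 85012
print(lio_ref_tags([lio_line]))        # {85016}
print(kern_to_utc("Mar 11 18:53:06"))  # 2015-03-12 01:53:06+00:00
```

The refITT from the ESXi excerpt (85012 in decimal) does not itself appear in the kern.log excerpt, so this only shows how the tags and times could be compared across the full logs.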