LIO iSCSI connection issues with ESXi

Hi,
    I have been using an LIO target connected to 4 ESXi hosts for testing
purposes. The ESXi hosts run about 30 virtual machines in total.
Whenever I run a heavy IO workload in the VMs for some tests, ESXi
flip-flops between marking the iSCSI connection online and offline.
This can be seen in the log snippets given at the bottom.

A brief note about the setup:

Ubuntu installed on a Supermicro box
8 x 3TB disks in RAID10, /dev/md0
4 x 480GB SSDs in RAID10, /dev/md1
bcache0 created from /dev/md0 (backing) and /dev/md1 (cache)
LVM volumes created on top of bcache0
10 Gbps Intel 82599ES NIC
10 Gbps on all ESXi hosts for the iSCSI portgroups
Jumbo frames enabled on the switch (Dell PowerConnect), ESXi, and Linux

Every time I see the connection flap in ESXi, I also see a lot of
ABORT_TASK messages logged by the LIO target. I have looked around in
the mailing list archives but I am not able to understand the root
cause of the problem. If I restart the target service, things are back
in order until I run heavy IO again. I have taken care of the
switch-related settings as per Dell's whitepaper on ESXi best
practices.

Today I increased the MaxConnections parameter to 2; I am not sure
whether it will help.

Any clues that could help me root-cause and solve this issue are
welcome. Please let me know if you need any other log files from
either the Linux storage box or the ESXi hosts; I'll pull them out for
you.

Thanks,
Govind



targetcli output

o- / ............................................................... [...]
  o- backstores .................................................... [...]
  | o- fileio ...................................................... [0 Storage Object]
  | o- iblock ...................................................... [2 Storage Objects]
  | | o- iscsi-vol1 ................................................ [/dev/vg1/iscsi-vol1 activated]
  | | o- iscsi-vol2 ................................................ [/dev/vg1/iscsi-vol2 activated]
  | o- pscsi ....................................................... [0 Storage Object]
  | o- rd_dr ....................................................... [0 Storage Object]
  | o- rd_mcp ...................................................... [0 Storage Object]
  o- ib_srpt ....................................................... [0 Targets]
  o- iscsi ......................................................... [4 Targets]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.0ab22395332f .. [1 TPG]
  | | o- tpgt1 ..................................................... [enabled]
  | |   o- acls .................................................... [1 ACL]
  | |   | o- iqn.1998-01.com.vmware:19-577fd24d .................... [2 Mapped LUNs]
  | |   |   o- mapped_lun1 ......................................... [lun1 (rw)]
  | |   |   o- mapped_lun2 ......................................... [lun2 (rw)]
  | |   o- luns .................................................... [2 LUNs]
  | |   | o- lun1 .................................................. [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  | |   | o- lun2 .................................................. [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  | |   o- portals ................................................. [1 Portal]
  | |     o- 10.12.9.249:3260 ...................................... [OK, iser disabled]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.1e0718367802 .. [1 TPG]
  | | o- tpgt1 ..................................................... [enabled]
  | |   o- acls .................................................... [1 ACL]
  | |   | o- iqn.1998-01.com.vmware:14-6c077dd9 .................... [2 Mapped LUNs]
  | |   |   o- mapped_lun1 ......................................... [lun1 (rw)]
  | |   |   o- mapped_lun2 ......................................... [lun2 (rw)]
  | |   o- luns .................................................... [2 LUNs]
  | |   | o- lun1 .................................................. [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  | |   | o- lun2 .................................................. [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  | |   o- portals ................................................. [1 Portal]
  | |     o- 10.12.9.249:3260 ...................................... [OK, iser disabled]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.4d381eea71d4 .. [1 TPG]
  | | o- tpgt1 ..................................................... [enabled]
  | |   o- acls .................................................... [1 ACL]
  | |   | o- iqn.1998-01.com.vmware:16 ............................. [2 Mapped LUNs]
  | |   |   o- mapped_lun1 ......................................... [lun1 (rw)]
  | |   |   o- mapped_lun2 ......................................... [lun2 (rw)]
  | |   o- luns .................................................... [2 LUNs]
  | |   | o- lun1 .................................................. [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  | |   | o- lun2 .................................................. [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  | |   o- portals ................................................. [1 Portal]
  | |     o- 10.12.9.249:3260 ...................................... [OK, iser disabled]
  | o- iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.8428b87f25ae .. [1 TPG]
  |   o- tpgt1 ..................................................... [enabled]
  |     o- acls .................................................... [1 ACL]
  |     | o- iqn.1998-01.com.vmware:17-25c02de7 .................... [2 Mapped LUNs]
  |     |   o- mapped_lun1 ......................................... [lun1 (rw)]
  |     |   o- mapped_lun2 ......................................... [lun2 (rw)]
  |     o- luns .................................................... [2 LUNs]
  |     | o- lun1 .................................................. [iblock/iscsi-vol1 (/dev/vg1/iscsi-vol1)]
  |     | o- lun2 .................................................. [iblock/iscsi-vol2 (/dev/vg1/iscsi-vol2)]
  |     o- portals ................................................. [1 Portal]
  |       o- 10.12.9.249:3260 ...................................... [OK, iser disabled]
  o- loopback ...................................................... [0 Targets]
  o- qla2xxx ....................................................... [0 Targets]
  o- tcm_fc ........................................................ [0 Targets]


Ubuntu kern.log around Mar 11 18:53 PST

Mar 11 18:53:06 storage1 kernel: [1059266.165587] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000c0
Mar 11 18:53:06 storage1 kernel: [1059266.168159] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85016
Mar 11 18:53:09 storage1 kernel: [1059269.575472] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000007
Mar 11 18:53:13 storage1 kernel: [1059273.356580] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000cb
Mar 11 18:53:13 storage1 kernel: [1059273.359017] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85980
Mar 11 18:53:19 storage1 kernel: [1059279.763943] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000ca
Mar 11 18:53:19 storage1 kernel: [1059279.766294] ABORT_TASK: Found referenced iSCSI task_tag: 85020
Mar 11 18:53:19 storage1 kernel: [1059279.766298] ABORT_TASK: ref_tag: 85020 already complete, skipping
Mar 11 18:53:19 storage1 kernel: [1059279.766300] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85020
Mar 11 18:53:23 storage1 kernel: [1059283.189916] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000011
Mar 11 18:53:23 storage1 kernel: [1059283.192360] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743929
Mar 11 18:53:23 storage1 kernel: [1059283.192366] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743928
Mar 11 18:53:26 storage1 kernel: [1059286.968498] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d5
Mar 11 18:53:33 storage1 kernel: [1059293.343955] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d3
Mar 11 18:53:33 storage1 kernel: [1059293.346318] ABORT_TASK: Found referenced iSCSI task_tag: 85024
Mar 11 18:53:33 storage1 kernel: [1059293.346321] ABORT_TASK: ref_tag: 85024 already complete, skipping
Mar 11 18:53:33 storage1 kernel: [1059293.346323] Unexpected ret: -32 send data 48
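
In case it helps anyone tally these events over a longer stretch of kern.log, here is a rough Python sketch. It is only an illustration: the category names (non_existent_lun, abort_not_found, abort_already_complete) are made up here, and the patterns are based solely on the message texts visible in the excerpt above. It also rejoins entries that the mailer wrapped onto a second line.

```python
import re
from collections import Counter

# A few entries copied from the kern.log excerpt, still wrapped as in the mail.
SAMPLE = """\
Mar 11 18:53:06 storage1 kernel: [1059266.165587] TARGET_CORE[iSCSI]:
Detected NON_EXISTENT_LUN Access for 0x000000c0
Mar 11 18:53:06 storage1 kernel: [1059266.168159] ABORT_TASK: Sending
TMR_TASK_DOES_NOT_EXIST for ref_tag: 85016
Mar 11 18:53:19 storage1 kernel: [1059279.766298] ABORT_TASK: ref_tag:
85020 already complete, skipping
"""

# A syslog entry starts with "Mon DD HH:MM:SS "; any other line is treated
# as the wrapped continuation of the previous entry.
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} ")

# Hypothetical category names; the regexes mirror the messages in the excerpt.
PATTERNS = {
    "non_existent_lun": re.compile(r"NON_EXISTENT_LUN Access for (0x[0-9a-fA-F]+)"),
    "abort_not_found": re.compile(r"Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: (\d+)"),
    "abort_already_complete": re.compile(r"ref_tag: (\d+) already complete"),
}


def rejoin(text):
    """Merge wrapped continuation lines back into one string per log entry."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line.strip())
        else:
            entries[-1] += " " + line.strip()
    return entries


def summarize(text):
    """Count how many log entries match each category."""
    counts = Counter()
    for entry in rejoin(text):
        for name, pattern in PATTERNS.items():
            if pattern.search(entry):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, count in sorted(summarize(SAMPLE).items()):
        print(name, count)
```

To run it against a real file, replace SAMPLE with the contents of the log, for example open("/var/log/kern.log").read().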


ESXi log around the same time (Note that the times are in UTC)

2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba38:CH:0 T:3 CN:0: iSCSI connection is being marked "OFFLINE" (Event:4)
2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID: 00023d000001 TARGET: iqn.2003-01.org.linux-iscsi.storage1.x8664:sn.0ab22395332f TPGT: 1 TSIH: 0]
2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 10.12.9.19:27631 R: 10.12.9.249:3260]
2015-03-12T01:53:07.765Z cpu9:33555)WARNING: iscsi_vmk: iscsivmk_TaskMgmtIssue: vmhba38:CH:0 T:3 L:2 : Task mgmt "Abort Task" with itt=0x14c15 (refITT=0x14c14) timed out.
2015-03-12T01:53:07.765Z cpu0:33557)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.6001405da598aed29c44cc781369ffad" state in doubt; requested fast path state update...
2015-03-12T01:53:07.880Z cpu0:33369)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.025Z cpu0:33557)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.170Z cpu0:35670)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.311Z cpu0:33557)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x89 (0x413687e5ab00, 34877) to dev "naa.6001405da598aed29c44cc781369ffad" on path "vmhba33:C0:T0:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-03-12T01:53:08.440Z cpu0:32795)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.6001405da598aed29c44cc781369ffad" state in doubt; requested fast path state update...
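
Since the ESXi log is in UTC with hexadecimal task tags while kern.log uses local time and decimal ref_tag values, a small Python sketch like the one below may make the two easier to line up. It assumes the storage box clock is at UTC-7, which is the offset at which the 18:53 kern.log entries coincide with the 01:53Z ESXi entries; whether the ESXi itt/refITT and the LIO ref_tag are really the same number in different bases is also an assumption, suggested only by the values falling in the same range.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the storage box runs at UTC-7 (see the note above).
LOCAL_OFFSET = timezone(timedelta(hours=-7))


def esxi_to_local(stamp):
    """Convert an ESXi timestamp such as 2015-03-12T01:53:07.765Z to local time."""
    utc = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return utc.astimezone(LOCAL_OFFSET)


def itt_to_decimal(hex_itt):
    """Convert a hex task tag from the ESXi log (e.g. 0x14c14) to decimal."""
    return int(hex_itt, 16)


if __name__ == "__main__":
    local = esxi_to_local("2015-03-12T01:53:07.765Z")
    print(local.strftime("%Y-%m-%d %H:%M:%S"))  # 2015-03-11 18:53:07
    print(itt_to_decimal("0x14c14"))            # 85012
    print(itt_to_decimal("0x14c15"))            # 85013
```

The timed-out abort in the ESXi log refers to refITT=0x14c14, which is 85012 in decimal, close to the ref_tag values 85016, 85020 and 85024 that appear in kern.log.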