Re: LIO iSCSI connection issues with ESXi

Hi Govind,

Adding Steve to the CC, as he was seeing a similar issue with ESX.

On Thu, 2015-03-12 at 17:05 -0700, Govindarajan wrote:
> Hi,
>     I have been using an LIO target connected to 4 ESXi hosts for
> testing purposes. On the ESXi hosts I have a total of about 30 virtual
> machines. Whenever I run a heavy IO workload in the VMs for some tests,
> ESXi seems to flip-flop between marking the iSCSI connection online
> and offline. This can be seen in the log snippet given at the bottom.
> 
> A brief note about the setup.
> 
> Ubuntu installed on a Supermicro box
> 8 x 3TB disks in raid10 /dev/md0
> 4 x 480GB SSDs in raid10 /dev/md1
> bcache0 created out of /dev/md0 and /dev/md1 (cache)
> LVM volumes created on top of bcache0
> 10 Gbps Intel 82599ES NIC
> 10 Gbps on all ESXi hosts for iSCSI portgroups
> switch (Dell PowerConnect), ESXi, and Linux enabled for jumbo frames
> 
> Every time I see the connection flap in ESXi, I see a lot of task
> aborts being sent by the LIO target. I have looked around in the
> mailing list archives but I am not able to understand the root cause
> of the problem. If I restart the target service, things are back in
> order until I run heavy IO again. I have taken care of the
> switch-related settings as per Dell's whitepaper on ESXi best
> practices.
> 
> Today I increased the MaxConnections param to 2; not sure if it will help.
> 

MaxConnections controls the number of connections per session with
MC/S (multiple connections per session).

Since ESX does not support MC/S, setting this to > 1 is a no-op.
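
If you want to double check what got configured, the iSCSI parameter
keys (MaxConnections included) live under the param/ group for each
TargetName+TPGT in configfs, along these lines (the IQN below is only
a placeholder, substitute your own TargetName and TPGT):

    # Show the MaxConnections value configured for one endpoint
    cat /sys/kernel/config/target/iscsi/iqn.2003-01.org.linux-iscsi.storage1/tpgt_1/param/MaxConnections

Since ESX only ever opens a single connection per session, whatever
value is reported there has no practical effect.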

> Any clues that could help me root-cause and solve this issue are
> welcome. Please do let me know if you need any other log files from
> either the Linux storage box or the ESXi hosts; I'll pull them out
> for you.
> 

<SNIP>

> 
> Ubuntu kern.log around Mar 11 18:53 PST
> 
> Mar 11 18:53:06 storage1 kernel: [1059266.165587] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000c0
> Mar 11 18:53:06 storage1 kernel: [1059266.168159] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85016
> Mar 11 18:53:09 storage1 kernel: [1059269.575472] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000007
> Mar 11 18:53:13 storage1 kernel: [1059273.356580] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000cb
> Mar 11 18:53:13 storage1 kernel: [1059273.359017] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85980
> Mar 11 18:53:19 storage1 kernel: [1059279.763943] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000ca
> Mar 11 18:53:19 storage1 kernel: [1059279.766294] ABORT_TASK: Found referenced iSCSI task_tag: 85020
> Mar 11 18:53:19 storage1 kernel: [1059279.766298] ABORT_TASK: ref_tag: 85020 already complete, skipping
> Mar 11 18:53:19 storage1 kernel: [1059279.766300] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85020
> Mar 11 18:53:23 storage1 kernel: [1059283.189916] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000011
> Mar 11 18:53:23 storage1 kernel: [1059283.192360] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743929
> Mar 11 18:53:23 storage1 kernel: [1059283.192366] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743928
> Mar 11 18:53:26 storage1 kernel: [1059286.968498] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d5
> Mar 11 18:53:33 storage1 kernel: [1059293.343955] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d3
> Mar 11 18:53:33 storage1 kernel: [1059293.346318] ABORT_TASK: Found referenced iSCSI task_tag: 85024
> Mar 11 18:53:33 storage1 kernel: [1059293.346321] ABORT_TASK: ref_tag: 85024 already complete, skipping
> Mar 11 18:53:33 storage1 kernel: [1059293.346323] Unexpected ret: -32 send data 48
> 

So ABORT_TASKs being generated by an ESX host typically mean one of
three things:

1) The backend storage is not fast enough to keep up with the workload.

This can happen if the backend is not completing I/Os before ESX's
internal SCSI timeout fires.  With ESX v5.x, the SCSI command timeout is
5000 ms (5 seconds), and IIRC it can't be changed for iSCSI.

Based upon your log above, there is a mix of ABORT_TASKs for commands
that have already completed, some acknowledged and some not yet
acknowledged.
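
One quick way to check whether the backend is the bottleneck is to
watch per-device service times while the heavy IO workload runs, e.g.
with iostat from the sysstat package (device names below are taken
from your description, adjust as needed):

    # Extended per-device statistics every 2 seconds
    iostat -x md0 md1 bcache0 2

If the await / service time columns climb into the thousands of
milliseconds under load, commands are sitting in the backend long
enough to trip the 5 second ESX timeout and generate ABORT_TASKs.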

2) The default_cmdsn_depth per iSCSI endpoint is too large.

Starting with v3.12 code, the default_cmdsn_depth is 64 by default
(i.e. the number of outstanding commands that can be in flight at a
given time per session).  This is configured on a per TargetName+TPGT
context basis, or a per NodeACL context basis.

I'd recommend trying a lower default_cmdsn_depth (say 16 or 8, or even
lower), in order to limit the number of outstanding commands ESX can
keep in flight at a given time.  Note that you'll need to restart the
session in order for the changes to take effect.
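
Via configfs that looks something like the following (both IQNs are
placeholders; substitute your TargetName+TPGT and the ESX initiator's
NodeACL):

    # Per TargetName+TPGT default cmdsn depth
    echo 16 > /sys/kernel/config/target/iscsi/iqn.2003-01.org.linux-iscsi.storage1/tpgt_1/attrib/default_cmdsn_depth

    # Or per NodeACL, overriding the TPG default for a single initiator
    echo 16 > /sys/kernel/config/target/iscsi/iqn.2003-01.org.linux-iscsi.storage1/tpgt_1/acls/iqn.1998-01.com.vmware:esx-host1/cmdsn_depth

Either way, the new depth only applies once the ESX initiator logs out
and back in (or the session is otherwise restarted).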

3) There are too many LUNs on a single target export, causing the ESX
initiator to hit internal false positive timeouts.

ESX has a known issue where, if too many LUNs are exported on a single
TargetName+TPGT endpoint (say > 8 LUNs per endpoint), it will begin to
hit false positive timeouts internally due to scheduling fairness
issues within the ESX SCSI host subsystem.

For 10 Gb/sec ports, I'd recommend keeping <= 4 LUNs per target endpoint
in order to avoid these types of false positives.  This can also depend
on how many total TargetName+TPGT endpoints have been configured. 
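
A quick way to see how your current config maps to that is to count
the lun_N directories per endpoint in configfs, along these lines
(assumes the standard configfs mount point and iqn-format TargetNames):

    # Count the LUNs exported by each TargetName+TPGT endpoint
    for tpg in /sys/kernel/config/target/iscsi/iqn.*/tpgt_*; do
        echo "$tpg: $(ls -d "$tpg"/lun/lun_* 2>/dev/null | wc -l) LUNs"
    done

If a single endpoint is carrying most of your LVs, splitting them
across additional TargetName+TPGT endpoints should keep the
per-endpoint LUN count down.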

HTH.

--nab
