Re: LIO iSCSI connection issues with ESXi

Hi Nicholas,
    Thanks for the detailed explanation. I have now set the default
CmdSN depth (default_cmdsn_depth) of all TPGTs to 16, and I will
monitor the connections after starting a heavy workload. I have also
observed that the Intel 3500s performing cache duty in bcache exhibit
latencies of more than 70ms for 64k (or slightly larger) writes, which
causes ripple effects all the way up the stack. I might replace these
3500s with 3700s.
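
For reference, a minimal fio sketch for spot-checking 64k synchronous
write latency on one of these SSDs. The filename is a placeholder, and
the test writes data, so point it at a scratch file or spare partition
rather than the live cache device:

    fio --name=lat64k --rw=randwrite --bs=64k --iodepth=1 --direct=1 \
        --size=1G --runtime=30 --time_based \
        --filename=/path/to/scratch/lat64k.bin

The clat percentiles in the output show whether the 70ms spikes
reproduce outside of the bcache/LVM stack.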

Regards,
Govind


On Fri, Mar 20, 2015 at 12:52 PM, Nicholas A. Bellinger
<nab@xxxxxxxxxxxxxxx> wrote:
> Hi Govind,
>
> Adding Steve to the CC, as he was seeing a similar issue with ESX.
>
> On Thu, 2015-03-12 at 17:05 -0700, Govindarajan wrote:
>> Hi,
>>     I have been using an LIO target connected to 4 ESXi hosts for
>> testing purposes. On the ESXi hosts I have a total of about 30
>> virtual machines. Whenever I run a heavy IO workload in the VMs for
>> some tests, ESXi seems to flip-flop between marking the iSCSI
>> connection online and offline. This can be seen in the log snippet
>> given at the bottom.
>>
>> A brief note about the set up.
>>
>> Ubuntu installed on a Supermicro box.
>> 8 x 3TB disks in raid10 /dev/md0
>> 4 x 480GB SSDs in raid10 /dev/md1
>> bcache0 created out of /dev/md0 and /dev/md1 (cache); assembly sketch below
>> LVM logical volumes created on top of bcache0
>> 10 Gbps Intel 82599ES NIC
>> 10 Gbps on all ESXi hosts for iSCSI portgroups
>> switch (Dell PowerConnect), ESXi, and Linux all enabled for jumbo frames
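>>
>> (A rough sketch of how such a bcache + LVM stack is typically
>> assembled, assuming bcache-tools and LVM2; the device, VG, and LV
>> names are illustrative, not necessarily the ones used here:
>>
>>     # create the cache and backing devices and attach them in one step
>>     make-bcache -C /dev/md1 -B /dev/md0
>>     # register them if udev does not do it automatically
>>     echo /dev/md0 > /sys/fs/bcache/register
>>     echo /dev/md1 > /sys/fs/bcache/register
>>     # carve LVM logical volumes out of the cached device
>>     pvcreate /dev/bcache0
>>     vgcreate vg_iscsi /dev/bcache0
>>     lvcreate -L 500G -n lun0 vg_iscsi
>>
>> Each logical volume is then exported as a LUN by the LIO target.)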
>>
>> Every time I see the connection flap in ESXi, I see a lot of task
>> aborts being sent by the LIO target. I have looked around in the
>> mailing list archives but I am not able to understand the root cause
>> of the problem. If I restart the target service, things are back in
>> order until I run heavy IO again. I have taken care of the
>> switch-related settings as per Dell's whitepaper on ESXi best practices.
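>>
>> (For reference, one quick way to confirm the jumbo-frame path end to
>> end is a non-fragmenting ping at the full 9000-byte MTU payload,
>> i.e. 9000 minus 28 bytes of IP/ICMP headers; the addresses are
>> placeholders:
>>
>>     # from the Linux target towards an ESXi vmkernel port
>>     ping -M do -s 8972 <esxi-vmkernel-ip>
>>     # from an ESXi host towards the target portal
>>     vmkping -d -s 8972 <target-portal-ip>
>>
>> If either direction fails, something in between is still dropping
>> jumbo frames.)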
>>
>> Today I increased the MaxConnections param to 2; I am not sure if it
>> will help.
>>
>
> MaxConnections controls the number of connections per session with MC/S.
>
> Since ESX does not support MC/S, setting this to > 1 is a nop.
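>
> (For reference, the value being offered to initiators can be checked
> in configfs; the path below assumes the standard iSCSI target layout,
> and the IQN is a placeholder:
>
>     cat /sys/kernel/config/target/iscsi/<target-iqn>/tpgt_1/param/MaxConnections
>
> As noted above, though, ESX will only ever open a single connection
> per session.)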
>
>> Any clues are welcome that could help me root cause and solve this
>> issue. Please do let me know if you need any other log files from
>> either the linux storage box or the esxi hosts, I'll pull them out for
>> you.
>>
>
> <SNIP>
>
>>
>> Ubuntu kern.log around Mar 11 18:53 PST
>>
>> Mar 11 18:53:06 storage1 kernel: [1059266.165587] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000c0
>> Mar 11 18:53:06 storage1 kernel: [1059266.168159] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85016
>> Mar 11 18:53:09 storage1 kernel: [1059269.575472] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000007
>> Mar 11 18:53:13 storage1 kernel: [1059273.356580] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000cb
>> Mar 11 18:53:13 storage1 kernel: [1059273.359017] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85980
>> Mar 11 18:53:19 storage1 kernel: [1059279.763943] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000ca
>> Mar 11 18:53:19 storage1 kernel: [1059279.766294] ABORT_TASK: Found referenced iSCSI task_tag: 85020
>> Mar 11 18:53:19 storage1 kernel: [1059279.766298] ABORT_TASK: ref_tag: 85020 already complete, skipping
>> Mar 11 18:53:19 storage1 kernel: [1059279.766300] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 85020
>> Mar 11 18:53:23 storage1 kernel: [1059283.189916] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000011
>> Mar 11 18:53:23 storage1 kernel: [1059283.192360] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743929
>> Mar 11 18:53:23 storage1 kernel: [1059283.192366] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 41743928
>> Mar 11 18:53:26 storage1 kernel: [1059286.968498] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d5
>> Mar 11 18:53:33 storage1 kernel: [1059293.343955] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x000000d3
>> Mar 11 18:53:33 storage1 kernel: [1059293.346318] ABORT_TASK: Found referenced iSCSI task_tag: 85024
>> Mar 11 18:53:33 storage1 kernel: [1059293.346321] ABORT_TASK: ref_tag: 85024 already complete, skipping
>> Mar 11 18:53:33 storage1 kernel: [1059293.346323] Unexpected ret: -32 send data 48
>>
>
> So ABORT_TASKs being generated by an ESX host typically means one of
> three things:
>
> 1) The backend storage is not fast enough to keep up with the workload.
>
> This can happen if the backend is not completing I/Os before ESX's
> internal SCSI timeout fires.  With ESX v5.x, the SCSI command timeout is
> 5000 ms (5 seconds), and IIRC for iSCSI can't be changed.
>
> Based upon your log above, there is a mix of ABORT_TASKs for commands
> that have already completed, some acknowledged and some not.
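>
> (One way to check whether case 1) applies is to watch per-device
> latency on the target while the workload is running, e.g. with
> sysstat's iostat:
>
>     iostat -x 2
>
> If the await / w_await columns for the SSDs, the md arrays, or
> bcache0 climb towards multiple seconds under load, the backend is the
> bottleneck.)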
>
> 2) The default_cmdsn_depth per iSCSI endpoint is too large.
>
> By default, starting with v3.12 code, the default_cmdsn_depth is 64
> (i.e. the number of outstanding commands that can be in flight at a
> given time per session).  This is configured on a per-TargetName+TPGT
> basis, or a per-NodeACL basis.
>
> I'd recommend trying a lower default_cmdsn_depth (say 16, 8, or lower)
> in order to limit the number of outstanding commands ESX can keep in
> flight at a given time.  Note that you'll need to restart the session
> in order for the change to take effect.
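>
> (A minimal sketch of where these knobs live in configfs, assuming the
> standard layout; the IQNs are placeholders, and targetcli exposes the
> same attributes:
>
>     # per TargetName+TPGT endpoint, applies to new sessions on that TPG
>     echo 16 > /sys/kernel/config/target/iscsi/<target-iqn>/tpgt_1/attrib/default_cmdsn_depth
>
>     # per-initiator override on a NodeACL
>     echo 16 > /sys/kernel/config/target/iscsi/<target-iqn>/tpgt_1/acls/<initiator-iqn>/cmdsn_depth
>
> After changing either, log the ESX initiator out and back in so the
> session is re-established with the new depth.)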
>
> 3) There are too many LUNs on a single target export, causing the ESX
> initiator to hit internal false positive timeouts.
>
> ESX has a known issue where, if too many LUNs are exported on a single
> TargetName+TPGT endpoint (say > 8 LUNs per endpoint), it will begin to
> hit false-positive timeouts internally due to scheduling fairness
> issues within the ESX SCSI host subsystem.
>
> For 10 Gb/sec ports, I'd recommend keeping <= 4 LUNs per target endpoint
> in order to avoid these types of false positives.  This can also depend
> on how many total TargetName+TPGT endpoints have been configured.
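>
> (A quick way to see the current LUN count per endpoint from configfs,
> assuming the standard layout:
>
>     for tpg in /sys/kernel/config/target/iscsi/*/tpgt_*; do
>         echo "$tpg: $(ls "$tpg"/lun | wc -l) LUNs"
>     done
>
> If any endpoint is well above 4 LUNs, splitting its LUNs across
> additional TargetName+TPGT endpoints should help.)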
>
> HTH.
>
> --nab
>
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



