Re: target crashes with vSphere 6 hosts

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Wed, 27 Jan 2016 00:12:48 -0800

On Tue, 2016-01-26 at 09:11 -0500, Dan Lane wrote:
> Thanks for all the information Nicholas, there's only one part that I
> think is really in question - the idea that the backend can't keep up
> with the workload.  You may remember I had similar problems in the
> past that I contacted the mailing list for.  After talking about the
> problem quite a bit I decided I needed a better backend, so I built
> out this new system that should have absolutely no problem keeping up.
> I have 20x 10k enterprise raptor drives in RAID6 with an SSD acting as
> read and write cache via LSI Cachecade.
> 

> BTW, I get the "ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for
> ref_tag" error with vsphere5, but LIO doesn't crash in that case.
> Also, I seem to get the error regardless of the system load.
> 

To clarify, an ABORT_TASK means the default SCSI timeout threshold has
been reached by ESX.

What I meant earlier by 'if the backend is able to keep-up' is that you
should not be hitting constant ABORT_TASK during normal operation.
Otherwise your ESX hosts are keeping more I/Os in flight than the
backend storage is able to process within the default FC LLD SCSI
timeout.

For example, using iSCSI with ESX >= v5.x, the SCSI timeout is hard-coded
to 5 seconds.  This means ABORT_TASKs will be generated when ESX iSCSI
does not receive an I/O response (from the target) within this period, as
long as the iSCSI session I_T nexus (and associated ESX multipath state)
is still active.
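
As a rough illustration of why too many outstanding I/Os end in
ABORT_TASK (a simplified model, not something measured on this setup):
if the backend completes commands at a steady rate, the last of N
commands in flight waits about N / IOPS seconds.  The 400 IOPS figure
and the function names below are hypothetical.

```shell
# Simplified model (assumption): the backend completes I/Os at a steady
# rate, so the last of N queued commands waits roughly N / IOPS seconds.

max_inflight() {
    # $1 = initiator SCSI timeout in seconds, $2 = sustained backend IOPS
    # Prints the largest number of in-flight commands that can still all
    # complete before the timeout expires.
    echo $(( $1 * $2 ))
}

worst_case_wait_ms() {
    # $1 = commands in flight, $2 = sustained backend IOPS
    # Prints the approximate wait of the last queued command, in ms.
    echo $(( $1 * 1000 / $2 ))
}

# Hypothetical numbers: 5 s timeout, backend sustaining 400 IOPS under load.
max_inflight 5 400            # prints 2000
worst_case_wait_ms 128 400    # prints 320   (well inside 5 s)
worst_case_wait_ms 4000 400   # prints 10000 (twice the timeout)
```

Real backends are burstier than this, so treat the result as an upper
bound on a sane queue depth rather than a target.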

With iSCSI, this queue depth is controlled by the default_cmdsn_depth
value on the target side, or by explicit NodeACLs that define it on a per
initiator login basis. The ESX iSCSI initiator honors this limit for
the outstanding I/O per session, separate from some well known
ESX scheduling fairness issues with multi-LUN sessions.
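
A minimal sketch of setting these limits through configfs, assuming the
usual LIO layout under /sys/kernel/config/target/iscsi.  The IQNs, the
TPG tag and the depth of 16 are placeholders, and the LIO_ISCSI_ROOT
variable and function names are made up for this example.

```shell
# Root of the LIO iSCSI configfs tree (assumed default location).  It is
# overridable so the functions can be tried against a scratch directory.
: "${LIO_ISCSI_ROOT:=/sys/kernel/config/target/iscsi}"

set_default_cmdsn_depth() {
    # $1 = target IQN, $2 = TPG tag, $3 = depth
    # TPG-wide default, used for initiators without an explicit setting.
    echo "$3" > "$LIO_ISCSI_ROOT/$1/tpgt_$2/attrib/default_cmdsn_depth"
}

set_acl_cmdsn_depth() {
    # $1 = target IQN, $2 = TPG tag, $3 = initiator IQN, $4 = depth
    # Per-initiator value on an existing NodeACL.
    echo "$4" > "$LIO_ISCSI_ROOT/$1/tpgt_$2/acls/$3/cmdsn_depth"
}

# Example invocations (placeholder IQNs):
#   set_default_cmdsn_depth iqn.2003-01.org.linux-iscsi.example:sn.0001 1 16
#   set_acl_cmdsn_depth iqn.2003-01.org.linux-iscsi.example:sn.0001 1 \
#       iqn.1998-01.com.vmware:esx01 16
```

Existing sessions may need to log in again before a new depth applies.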

However with FC and qla2xxx target mode, the only (target side) limit is
the hardcoded qla2xxx hw frame limit of ATIO_ENTRY_CNT_24XX=4096.

Which means (as a work-around) you can consider changing the
following for your ESX FC host driver:

  - lowering the default FC LLD host queue_depth, and/or
  - increasing FC LLD SCSI timeout

Last time I checked these settings were configurable for ESX FC LLD
with qla2xxx, but you'll still need to grok the actual parameter names
if you want to consider changing them.

> I really appreciate the time you put into the kernel update information!
> 

Most certainly.  I'll be updating the 4.4-stable branch with Quinn's
additional qla2xxx patch shortly.

> Thanks,
> Dan
> 
> On Tue, Jan 26, 2016 at 1:55 AM, Nicholas A. Bellinger
> <nab@xxxxxxxxxxxxxxx> wrote:
> > Hi Dan,
> >
> > (Adding Quinn + Giri CC'd)
> >
> > On Mon, 2016-01-25 at 16:08 -0500, Dan Lane wrote:
> >> Update: If it matters, I tried loading a host with ESXi 5.5 u3b and
> >> that also crashed the filer.
> >
> > Thanks for your bug report.
> >
> > Note this bug is not specific to ESXi 6.0, and the scenario occurs when
> > backend driver exports are unable to keep up with host workload,
> > resulting in internal ESX ABORT_TASK + LUN_RESET to trigger across
> > multiple local+remote ports.
> >
> > The following WIP series is for addressing this bug:
> >
> > http://www.spinics.net/lists/target-devel/msg11691.html
> >
> > I've been verifying this on iscsi-target exports over the last weeks,
> > and the specific bug you're hitting is AFAICT not qla2xxx driver specific.
> >
> >>   Still waiting on an answer about the
> >> applying of upstream commits to an OS like fedora or any other ideas
> >> about the cause of this.
> >>
> >
> > So you'll want to build a v4.4 kernel until these patches are merged for
> > v4.5-rc, and eventually back-ported into v4.2.y stable.
> >
> > Note the recent qla2xxx target fixes from v4.5-rc1 are something you'll
> > want too.
> >
> > A v4.4 based '$ORIGIN $BRANCH' with everything is here:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git 4.4-stable
> >
> > If you're not familiar with git cloning + building kernel source, have a look at:
> >
> > http://kernelnewbies.org/KernelBuild
> > https://fedoraproject.org/wiki/BuildingUpstreamKernel
> >
> > So do a fresh 'git clone' of linux-stable.git and then:
> >
> >   git checkout --track -b linux-4.4.y
> >
> > to switch to a new local branch, and then:
> >
> >   git pull '$ORIGIN $BRANCH'
> >
> > to do a remote merge using the full target-pending.git 4.4-stable path
> > from above.
> >
> > You can do a 'make defconfig' or use the local 4.2.y
> > /boot/config-$VERSION, and build vmlinux + modules + initrd
> > from there.
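
Put together, the steps above might look roughly like the following.
This is a sketch under a few assumptions not stated in the original:
the local branch starts from the v4.4 tag, the running kernel's /boot
config is used as the base, and 'make olddefconfig' fills in any new
options.  The run/DRY_RUN wrapper and the function name are only there
so the sequence can be previewed without executing it.

```shell
STABLE_URL=git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
TARGET_URL=git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git

run() {
    # Echo the command, then execute it unless DRY_RUN=1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

build_target_kernel() {
    run git clone "$STABLE_URL" linux-stable &&
    run cd linux-stable &&
    # Assumption: start the local branch at the v4.4 release tag.
    run git checkout -b linux-4.4.y v4.4 &&
    # Remote merge of the target-pending 4.4-stable branch.
    run git pull "$TARGET_URL" 4.4-stable &&
    # Assumption: reuse the running kernel's config as the starting point.
    run cp "/boot/config-$(uname -r)" .config &&
    run make olddefconfig &&
    run make -j"$(nproc)" &&
    run sudo make modules_install install
}

# Preview the steps without running them:
#   DRY_RUN=1; build_target_kernel
```

On most distributions 'make install' also regenerates the initrd and
boot loader entry, but check that before rebooting into the new kernel.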
> >
> > Please let the list know your progress.
> >
> >> On Sun, Jan 24, 2016 at 8:11 PM, Roland Dreier <roland@xxxxxxxxxxxxxxx> wrote:
> >> >> I have tried a large number of other hosts and they all act the same
> >> >> way regardless of hardware.  ESXi <6 is no problem, but 6 and newer
> >> >> crash the filer very quickly.
> >> >
> >> > You're crashing because of
> >> >
> >> > Jan 24 10:02:09 dracofiler kernel: kernel BUG at
> >> > drivers/scsi/qla2xxx/qla_target.c:3105!
> >> >
> >> > which is the BUG_ON in
> >> >
> >> > void qlt_free_cmd(struct qla_tgt_cmd *cmd)
> >> > {
> >> >         struct qla_tgt_sess *sess = cmd->sess;
> >> >
> >> >         ql_dbg(ql_dbg_tgt, cmd->vha, 0xe074,
> >> >             "%s: se_cmd[%p] ox_id %04x\n",
> >> >             __func__, &cmd->se_cmd,
> >> >             be16_to_cpu(cmd->atio.u.isp24.fcp_hdr.ox_id));
> >> >
> >> >         BUG_ON(cmd->cmd_in_wq);
> >> >
> >> > It seems we're freeing a command before we process it.
> >> >
> >> > What logging do you have from target or qla2xxx before you hit the
> >> > crash?  I'm wondering why the initiator is aborting commands (although
> >> > we still shouldn't crash even if it does abort commands).
> >> >
> >> > You could try applying upstream commit 193b50b9d54a ("qla2xxx: Replace
> >> > QLA_TGT_STATE_ABORTED with a bit.") which seems like it might be
> >> > related, though I'm not sure whether it really will help.
> >> >
> >> >  - R.
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >