Re: 3.12.5 Target Errors

On Thu, 2014-05-15 at 18:37 +0000, Moussa Ba (moussaba) wrote:
> 
> > -----Original Message-----
> > From: target-devel-owner@xxxxxxxxxxxxxxx [mailto:target-devel-
> > owner@xxxxxxxxxxxxxxx] On Behalf Of Sagi Grimberg
> > Sent: Thursday, May 15, 2014 9:50 AM
> > To: Moussa Ba (moussaba); target-devel@xxxxxxxxxxxxxxx
> > Cc: Nicholas Bellinger
> > Subject: Re: 3.12.5 Target Errors
> > 
> > On 5/15/2014 7:35 PM, Moussa Ba (moussaba) wrote:
> > > We are deploying a test environment in house and received complaints
> > of failures while installing OSes on target LUNs.  Looking at the
> > target machine, we observed the following errors in dmesg; these are
> > repeated.  Is this a known issue? Has it been fixed?
> > 
> > Hey Moussa,
> > 
> > I can say that since kernel 3.12.5, ib_isert has received some
> > important stability fixes.
> > 
> > The below list corruption seems to originate in the TX coalescing work
> > that was done by Nic.
> > 
> > Does your kernel have the below commit applied? (although I don't know
> > if that went to 3.12 stable kernels...)
> > commit ebbe442183b7b8192c963266f1c89048fefc63a5
> > Author: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> > Date:   Sun Mar 2 14:51:12 2014 -0800
> > 
> >      iser-target: Fix command leak for tx_desc->comp_llnode_batch
> > 
> >      This patch addresses a number of active I/O shutdown issues
> >      related to isert_cmd descriptors being leaked that are part
> >      of a completion interrupt coalescing batch.
> > 
> >      This includes adding logic in isert_cq_tx_comp_err() to
> >      drain any associated tx_desc->comp_llnode_batch, as well
> >      as isert_cq_drain_comp_llist() to drain any associated
> >      isert_conn->conn_comp_llist.
> > 
> >      Also, set tx_desc->llnode_active in isert_init_send_wr()
> >      in order to determine when work requests need to be skipped
> >      in isert_cq_tx_work() exception path code.
> > 
> >      Finally, update isert_init_send_wr() to only allow interrupt
> >      coalescing when ISER_CONN_UP.
> > 
> >      Acked-by: Sagi Grimberg <sagig@xxxxxxxxxxxx>
> >      Cc: Or Gerlitz <ogerlitz@xxxxxxxxxxxx>
> >      Cc: <stable@xxxxxxxxxxxxxxx> #3.13+
> >      Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> > 
> > Moreover, a more detailed scenario may also help...
> > 
> > Cheers,
> > Sagi.
> 
> The patch above was not applied.

To clarify, commit ebbe4421 is specific to >= v3.13 only, and has
already been included in v3.13.y stable code.

>   The setup consists of a 3.12.5 target machine with 32 LUNs defined.
> These are all backed by PCIe SSDs.  Initiators are VMware v5.5 using
> the latest Ethernet iSER drivers.  LUNs are mapped to datastores.  In
> one instance we have received reports of targets simply disappearing
> on ESX. We don't yet have dmesg output for those reports. In the
> instance I reported however, the user was going through an
> installation process of a CentOS VM that never completed. Checking on
> the target I observed the errors I attached earlier.  
> 
> Which stable version can you recommend as including the most recent
> stability fixes?  We have 4 target systems deployed, and I suspect
> this issue will manifest itself on all 4, hence my desire to resolve
> it quickly. Thank you.

So v3.12.18 includes a number of iser-target fixes related to active I/O
shutdown.  There are two more that are queued by Jiri for the next
v3.12.y release, that are posted here:

http://comments.gmane.org/gmane.linux.scsi.target.devel/6241

[PATCH-v3.12.y 1/2] iser-target: Match FRMR descriptors to available session tags
[PATCH-v3.12.y 2/2] iser-target: Add missing se_cmd put for WRITE_PENDING in tx_comp_err

The latest set from Sagi is in target-pending/master, and will be
included in v3.15-rc6 over the next few days:

https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=dd12980fa5110ce8d06adc889b58a6ec2d380cc4
https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=4a736d517bde31ceb1ae42142be0d9e88c526ecd
https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=98719d71f58d76616e8ba755f70f39f843ec540f

I'd recommend applying all 5 of these patches atop v3.12.18.
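
For checking which of the 4 deployed targets are still below that
baseline, something like the following could be run on each machine.
It is only a rough sketch: the helper names (parse_kernel_version,
needs_stable_update) are made up for illustration, and it assumes the
release string reported by the running kernel starts with a plain
major.minor.patch version. It says nothing about whether the 5 patches
above have been applied on top.

```python
import platform
import re

# Baseline named in this thread: v3.12.18 carries a number of
# iser-target fixes related to active I/O shutdown.
BASELINE = (3, 12, 18)


def parse_kernel_version(release):
    """Return (major, minor, patch) from a string such as '3.12.5'.

    Distro suffixes such as '-1.el6.x86_64' are ignored. A missing
    patch level (e.g. '3.13') is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def needs_stable_update(release, baseline=BASELINE):
    """True if release is in the baseline's major.minor series but older."""
    version = parse_kernel_version(release)
    return version[:2] == baseline[:2] and version < baseline


if __name__ == "__main__":
    running = platform.release()
    if needs_stable_update(running):
        print("%s is older than v%d.%d.%d" % ((running,) + BASELINE))
    else:
        print("%s is not an older release in the v%d.%d.y series"
              % (running, BASELINE[0], BASELINE[1]))
```

A 3.12.5 target would be flagged by this check, while a v3.13.y
machine would not, since it is in a different stable series.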

--nab
