Re: 3.12.5 Target Errors

On Thu, 2014-05-15 at 14:18 -0700, Nicholas A. Bellinger wrote:
> On Thu, 2014-05-15 at 18:37 +0000, Moussa Ba (moussaba) wrote:
> > 
> > > -----Original Message-----
> > > From: target-devel-owner@xxxxxxxxxxxxxxx [mailto:target-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sagi Grimberg
> > > Sent: Thursday, May 15, 2014 9:50 AM
> > > To: Moussa Ba (moussaba); target-devel@xxxxxxxxxxxxxxx
> > > Cc: Nicholas Bellinger
> > > Subject: Re: 3.12.5 Target Errors
> > > 
> > > On 5/15/2014 7:35 PM, Moussa Ba (moussaba) wrote:
> > > > We are deploying a test environment in house and received complaints
> > > > of failures while installing OSes on target LUNs.  Looking at the
> > > > target machine, we observed the following errors in dmesg; these are
> > > > repeated.  Is this a known issue? Has it been fixed?
> > > 
> > > Hey Moussa,
> > > 
> > > I can say that since kernel 3.12.5, ib_isert has received some
> > > important stability fixes.
> > > 
> > > The below list corruption seems to originate in the TX coalescing work
> > > that was done by Nic.
> > > 
> > > Does your kernel have the below commit applied? (although I don't know
> > > if that went to 3.12 stable kernels...)
> > > commit ebbe442183b7b8192c963266f1c89048fefc63a5
> > > Author: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> > > Date:   Sun Mar 2 14:51:12 2014 -0800
> > > 
> > >      iser-target: Fix command leak for tx_desc->comp_llnode_batch
> > > 
> > >      This patch addresses a number of active I/O shutdown issues
> > >      related to isert_cmd descriptors being leaked that are part
> > >      of a completion interrupt coalescing batch.
> > > 
> > >      This includes adding logic in isert_cq_tx_comp_err() to
> > >      drain any associated tx_desc->comp_llnode_batch, as well
> > >      as isert_cq_drain_comp_llist() to drain any associated
> > >      isert_conn->conn_comp_llist.
> > > 
> > >      Also, set tx_desc->llnode_active in isert_init_send_wr()
> > >      in order to determine when work requests need to be skipped
> > >      in isert_cq_tx_work() exception path code.
> > > 
> > >      Finally, update isert_init_send_wr() to only allow interrupt
> > >      coalescing when ISER_CONN_UP.
> > > 
> > >      Acked-by: Sagi Grimberg <sagig@xxxxxxxxxxxx>
> > >      Cc: Or Gerlitz <ogerlitz@xxxxxxxxxxxx>
> > >      Cc: <stable@xxxxxxxxxxxxxxx> #3.13+
> > >      Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> > > 
> > > Moreover, a more detailed scenario may also help...
> > > 
> > > Cheers,
> > > Sagi.
> > 
> > The patch above was not applied.
> 
> To clarify, commit ebbe4421 is specific to >= v3.13 only, and has
> already been included in v3.13.y stable code.
> 
> >   The setup consists of 3.12.5 target machines with 32 LUNs defined.
> > These are all backed by PCIe SSDs.  Initiators are VMware v5.5 using
> > the latest Ethernet iSER drivers.  LUNs are mapped to datastores.  In
> > one instance we have received reports of targets simply disappearing
> > on ESX. We don't yet have dmesg output for those reports. In the
> > instance I reported however, the user was going through an
> > installation process of a CentOS VM that never completed. Checking on
> > the target I observed the errors I attached earlier.  
> > 
> > Which stable version can you recommend as including the most recent
> > stability fixes?  We have 4 target systems deployed, and I suspect
> > this issue will manifest itself on all 4, hence my desire to resolve
> > it quickly. Thank you.
> 
> So v3.12.18 includes a number of iser-target fixes related to active I/O
> shutdown.  There are two more that are queued by Jiri for the next
> v3.12.y release, 

Ah sorry.  v3.12.19 is the current stable release, and v3.12.20 will
include the two PATCH-v3.12.y patches below.

--nab

> that are posted here:
> 
> http://comments.gmane.org/gmane.linux.scsi.target.devel/6241
> 
> [PATCH-v3.12.y 1/2] iser-target: Match FRMR descriptors to available session tags
> [PATCH-v3.12.y 2/2] iser-target: Add missing se_cmd put for WRITE_PENDING in tx_comp_err
> 
> The last set from Sagi is in target-pending/master, and will be
> included in v3.15-rc6 over the next few days:
> 
> https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=dd12980fa5110ce8d06adc889b58a6ec2d380cc4
> https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=4a736d517bde31ceb1ae42142be0d9e88c526ecd
> https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=98719d71f58d76616e8ba755f70f39f843ec540f
> 
> I'd recommend applying all 5 of these patches atop v3.12.18.
> 
> --nab
> 
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
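
For anyone trying to work out where a given target sits relative to the
releases named in this thread, the version comparisons can be sketched in
Python.  This is only an illustrative sketch: the helper names
(parse_kernel_version, in_stable_range, has_point_release) are hypothetical
and not part of any kernel tooling, and it handles plain vX.Y[.Z] strings
only, not -rc tags such as v3.15-rc6.

```python
import re
from typing import Tuple


def parse_kernel_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v3.12.18' or '3.13' into a tuple of ints, e.g. (3, 12, 18)."""
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", tag.strip())
    if match is None:
        raise ValueError("unrecognised kernel version: %r" % tag)
    return tuple(int(part) for part in match.group(1).split("."))


def in_stable_range(running: str, floor: str) -> bool:
    """True if the running kernel's X.Y series is at or above `floor`.

    Models a 'Cc: stable #3.13+' style annotation: only the major.minor
    series is compared, the point release is ignored.
    """
    return parse_kernel_version(running)[:2] >= parse_kernel_version(floor)[:2]


def has_point_release(running: str, fixed_in: str) -> bool:
    """True if `running` is in the same X.Y series as `fixed_in`
    and at or past that point release."""
    run = parse_kernel_version(running)
    fix = parse_kernel_version(fixed_in)
    return run[:2] == fix[:2] and run >= fix


if __name__ == "__main__":
    # Commit ebbe4421 is tagged for 3.13+, so a 3.12.5 target is outside its range.
    print(in_stable_range("3.12.5", "3.13"))
    # The iser-target shutdown fixes mentioned above landed by v3.12.18.
    print(has_point_release("3.12.5", "v3.12.18"))
    print(has_point_release("3.12.19", "v3.12.18"))
```

Comparing tuples of integers rather than raw strings avoids the usual trap
where "3.12.5" sorts after "3.12.18" lexically.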
