On Thu, 2014-05-15 at 18:37 +0000, Moussa Ba (moussaba) wrote:
>
> > -----Original Message-----
> > From: target-devel-owner@xxxxxxxxxxxxxxx
> > [mailto:target-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sagi Grimberg
> > Sent: Thursday, May 15, 2014 9:50 AM
> > To: Moussa Ba (moussaba); target-devel@xxxxxxxxxxxxxxx
> > Cc: Nicholas Bellinger
> > Subject: Re: 3.12.5 Target Errors
> >
> > On 5/15/2014 7:35 PM, Moussa Ba (moussaba) wrote:
> > > We are deploying a test environment in house and received complaints
> > > of failures while installing OSes on target LUNs.  Looking at the
> > > target machine, we observed the following errors in dmesg, these are
> > > repeated.  Is this a known issue? has it been fixed?
> >
> > Hey Moussa,
> >
> > I can say that since kernel 3.12.5 ib_isert was added with some
> > important stability fixes.
> >
> > The below list corruption seems to originate in the TX coalescing work
> > that was done by Nic.
> >
> > Does your kernel have the below commit applied? (although I don't know
> > if that went to 3.12 stable kernels...)
> >
> > commit ebbe442183b7b8192c963266f1c89048fefc63a5
> > Author: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> > Date:   Sun Mar 2 14:51:12 2014 -0800
> >
> >     iser-target: Fix command leak for tx_desc->comp_llnode_batch
> >
> >     This patch addresses a number of active I/O shutdown issues
> >     related to isert_cmd descriptors being leaked that are part
> >     of a completion interrupt coalescing batch.
> >
> >     This includes adding logic in isert_cq_tx_comp_err() to
> >     drain any associated tx_desc->comp_llnode_batch, as well
> >     as isert_cq_drain_comp_llist() to drain any associated
> >     isert_conn->conn_comp_llist.
> >
> >     Also, set tx_desc->llnode_active in isert_init_send_wr()
> >     in order to determine when work requests need to be skipped
> >     in isert_cq_tx_work() exception path code.
> >
> >     Finally, update isert_init_send_wr() to only allow interrupt
> >     coalescing when ISER_CONN_UP.
> >
> >     Acked-by: Sagi Grimberg <sagig@xxxxxxxxxxxx>
> >     Cc: Or Gerlitz <ogerlitz@xxxxxxxxxxxx>
> >     Cc: <stable@xxxxxxxxxxxxxxx> #3.13+
> >     Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> >
> > Moreover, a more detailed scenario may also help...
> >
> > Cheers,
> > Sagi.
>
> The patch above was not applied.

To clarify, commit ebbe4421 is specific to >= v3.13 only, and has
already been included in v3.13.y stable code.

> The setup consists of a 3.12.5 target machines with 32 LUNs defined.
> These are all backed by PCIe SSDs.  Initiators are vmware v5.5 using
> the latest Ethernet ISER drivers.  LUNs are mapped to datastores.  In
> one instance we have received reports of targets simply disappearing
> on ESX.  We don't yet have dmesg output for those reports.  In the
> instance I reported however, the user was going through an
> installation process of a CentOS VM that never completed.  Checking on
> the target I observed the errors I attached earlier.
>
> Which stable version can you recommend as including the most recent
> stability fixes?  We have 4 target systems deployed, and I suspect
> this issue will manifest itself on all 4 hence my desire to resolve
> it quickly.  Thank you.

So v3.12.18 includes a number of iser-target fixes related to active I/O
shutdown.

There are two more that are queued by Jiri for the next v3.12.y release,
which are posted here:

http://comments.gmane.org/gmane.linux.scsi.target.devel/6241

[PATCH-v3.12.y 1/2] iser-target: Match FRMR descriptors to available session tags
[PATCH-v3.12.y 2/2] iser-target: Add missing se_cmd put for WRITE_PENDING in tx_comp_err

The last set from Sagi is in target-pending/master, and will be included
in v3.15-rc6 over the next few days:

https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=dd12980fa5110ce8d06adc889b58a6ec2d380cc4
https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=4a736d517bde31ceb1ae42142be0d9e88c526ecd
https://git.kernel.org/cgit/linux/kernel/git/nab/target-pending.git/commit/?id=98719d71f58d76616e8ba755f70f39f843ec540f

I'd recommend applying all 5 of these patches atop v3.12.18.

--nab
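
For checking several target systems against the v3.12.18 baseline named
in the reply, a minimal Python sketch follows. The function names
(parse_kernel_release, has_baseline_fixes) are hypothetical and not part
of any kernel tooling. It only compares release strings within the
3.12.y series, and it cannot tell whether the five additional patches
have been applied on top.

```python
import re

# Assumption: 3.12.18 is the baseline taken from the reply in this thread.
BASELINE = (3, 12, 18)


def parse_kernel_release(release):
    """Return (major, minor, patch) from a string such as '3.12.18-1.el6'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = match.groups()
    # A release like '3.12' has no patch component; treat it as patch 0.
    return int(major), int(minor), int(patch or 0)


def has_baseline_fixes(release, baseline=BASELINE):
    """True if release is in the baseline's stable series and not older.

    Releases from other series (for example 3.13.y) return False because
    this sketch makes no claim about them.
    """
    version = parse_kernel_release(release)
    return version[:2] == baseline[:2] and version >= baseline
```

On a target machine this could be called as
has_baseline_fixes(platform.release()) after "import platform"; a False
result for a 3.12.y kernel means it predates v3.12.18.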