Re: qla2xxx crashing kernel

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Wed, 17 Sep 2014 17:34:45 -0700

Hi Craig,

Thanks for the detailed bug report.  Comments are inline below.

Also, adding some QLogic folks on the CC'..

On Fri, 2014-09-05 at 19:53 -0400, Craig Watson wrote:
> Hi Nicholas,
> 
> One of our systems is seeing a kernel BUG with a fibre channel target
> system.  The system is an Intel i7 with a qle2562 FC HBA running FC20
> with kernel 3.14.4-200.  (We have seen this with several older kernel
> versions too).
> 
> The LIO component versions are:
>      targetcli    2.1.fb35-1
>      rtslib       2.1.fb47-1
>      configshell  1.1.fb13-1
> 
> The bug/crash is repeatable.  It happens every time the LIO target
> system is attached to the initiator.  The target is connected to an
> older VxWorks initiator system.  This system has been in service for
> years and works with several models of FC disk drives.  The new
> LIO-based system is an attempt to replace aging hardware with newer,
> more reliable, and higher performance storage.
> 

<nod>, thanks for the extra background on your setup.

> Here is the console output of the kernel BUG:
> 
> [ 2669.734186] ------------[ cut here ]------------
> [ 2669.738893] kernel BUG at drivers/scsi/qla2xxx/qla_target.c:1343!
> [ 2669.745080] invalid opcode: 0000 [#1] SMP
> [ 2669.749275] Modules linked in: tcm_qla2xxx target_core_pscsi
> target_core_file target_core_iblock iscsi_target_mod target_core_mod
> cfg80211 rfkill snd_hda_codec_realtek snd_hda_codec_generic
> snd_hda_intel snd_hda_codec serio_raw snd_hwdep snd_pcm snd_timer
> x86_pkg_temp_thermal snd e1000e ptp iTCO_wdt mei_me iTCO_vendor_support
> mei lpc_ich mfd_core raid0 i2c_i801 shpchp pps_core soundcore coretemp
> kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel
> ghash_clmulni_intel microcode usb_storage ata_generic pata_acpi i915
> pata_jmicron qla2xxx i2c_algo_bit drm_kms_helper drm scsi_transport_fc
> scsi_tgt i2c_core video
> [ 2669.805324] CPU: 1 PID: 558 Comm: kworker/1:0 Not tainted 3.14.4-200.fc20.x86_64 #1
> [ 2669.813052] Hardware name: Intel Corporation ChiefRiver Platform/PCOM-B219VG, BIOS R1.00.W3 09/24/2012
> [ 2669.822485] Workqueue: qla_tgt_wq qlt_do_work [qla2xxx]
> [ 2669.827810] task: ffff8800d28f1c80 ti: ffff8800d3eea000 task.ti: ffff8800d3eea000
> [ 2669.835399] RIP: 0010:[<ffffffffa00edfe4>] [<ffffffffa00edfe4>] qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> [ 2669.845308] RSP: 0018:ffff8800d3eebba8  EFLAGS: 00010246
> [ 2669.850712] RAX: ffff880117d3b400 RBX: ffff880036b9d6f0 RCX: ffffffffa010a19a
> [ 2669.857957] RDX: 0000000000000000 RSI: ffff880036b9d6f0 RDI: ffff8800d3eebc00
> [ 2669.865190] RBP: ffff8800d3eebbe0 R08: 0000000000000000 R09: 0000000000000020
> [ 2669.872418] R10: 0000000000000002 R11: 0000000000000000 R12: ffff8800d3eebc00
> [ 2669.879646] R13: ffff8800d3eebc00 R14: ffff8800d1e30000 R15: ffff880117d3
> [ 2669.886897] FS:  0000000000000000(0000) GS:ffff88011e240000(0000) knlGS:0000000000000000
> [ 2669.895102] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2669.900937] CR2: 00007f5efb75b000 CR3: 0000000001c0c000 CR4: 00000000001407e0
> [ 2669.908144] Stack:
> [ 2669.910187]  000000006acfe608 0000000100000000 ffff880036b9d6f0 ffff880117d3b400
> [ 2669.917797]  ffff8800d3eebc00 ffff8800d1e30000 ffff880117d3b72c ffff8800d3eebc78
> [ 2669.925387]  ffffffffa00ef933 0000000000000000 0000000000000000 ffff880117d3b400
> [ 2669.932969] Call Trace:
> [ 2669.935455]  [<ffffffffa00ef933>] qlt_rdy_to_xfer+0x93/0x210 [qla2xxx]
> [ 2669.942070]  [<ffffffffa03c267c>] tcm_qla2xxx_write_pending+0x5c/0x60 [tcm_qla2xxx]
> [ 2669.949855]  [<ffffffffa04ff677>] transport_generic_new_cmd+0x127/0x1a0 [target_core_mod]
> [ 2669.958184]  [<ffffffffa04ff72b>] transport_handle_cdb_direct+0x3b/0x90 [target_core_mod]
> [ 2669.966510]  [<ffffffffa04ff8c2>] target_submit_cmd_map_sgls+0x142/0x280 [target_core_mod]
> [ 2669.974917]  [<ffffffffa04ffa59>] target_submit_cmd+0x59/0x60 [target_core_mod]
> [ 2669.982336]  [<ffffffffa03c25ea>] tcm_qla2xxx_handle_cmd+0x5a/0x90 [tcm_qla2xxx]
> [ 2669.989847]  [<ffffffffa00f0624>] qlt_do_work+0x2e4/0x460 [qla2xxx]
> [ 2669.996231]  [<ffffffff810a6886>] process_one_work+0x176/0x430
> [ 2670.002163]  [<ffffffff810a74db>] worker_thread+0x11b/0x3a0
> [ 2670.007809]  [<ffffffff810a73c0>] ? rescuer_thread+0x370/0x370
> [ 2670.013710]  [<ffffffff810ae211>] kthread+0xe1/0x100
> [ 2670.018763]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> [ 2670.024943]  [<ffffffff816ff4bc>] ret_from_fork+0x7c/0xb0
> [ 2670.031673]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> [ 2670.039056] Code: 0f 1f 80 00 00 00 00 45 8b 4c 24 24 eb be 66 0f 1f 84 00
>                       00 00 00 00 48 c7 45 c8 00 00 00 00 4c 8b 3d 01 a6 b2 e1
>                       e9 18 ff ff ff <0f> 0b 48 8b b0 f8 02 00 00 44 8b 88 d0
>                       02 00 00 45 31 c0 31 c0
> [ 2670.062304] RIP  [<ffffffffa00edfe4>] qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> [ 2670.071041]  RSP <ffff8800d3eebba8>
> [ 2670.201729] ---[ end trace ec1db6cbc3980106 ]---
> [ 2670.207583] BUG: unable to handle kernel paging request at ffffffffffffffd8
> [ 2670.215756] IP: [<ffffffff810ae800>] kthread_data+0x10/0x20
> [ 2670.222563] PGD 1c0f067 PUD 1c11067 PMD 0
> [ 2670.227968] Oops: 0000 [#2] SMP
> [ 2670.232458] Modules linked in: tcm_qla2xxx target_core_pscsi
> target_core_file target_core_iblock iscsi_target_mod target_core_mod 
> cfg80211
> rfkill snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel 
> snd_hda_codec
> serio_raw snd_hwdep snd_pcm snd_timer x86_pkg_temp_thermal snd e1000e ptp
> iTCO_wdt mei_me iTCO_vendor_support mei lpc_ich mfd_core raid0 i2c_i801 
> shpchp
> pps_core soundcore coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul
> crc32c_intel ghash_clmulni_intel microcode usb_storage ata_generic pata_acpi
> i915 pata_jmicron qla2xxx i2c_algo_bit drm_kms_helper drm scsi_transport_fc
> scsi_tgt i2c_core video
> [ 2670.293550] CPU: 1 PID: 558 Comm: kworker/1:0 Tainted: G D 3.14.4-200.fc20.x86_64 #1
> [ 2670.303354] Hardware name: Intel Corporation ChiefRiver Platform/PCOM-B219VG,BIOS R1.00.W3 09/24/2012
> [ 2670.313751] task: ffff8800d28f1c80 ti: ffff8800d3eea000 task.ti: ffff8800d3eea000
> [ 2670.322257] RIP: 0010:[<ffffffff810ae800>] [<ffffffff810ae800>] kthread_data+0x10/0x20
> [ 2670.331302] RSP: 0018:ffff8800d3eeb880  EFLAGS: 00010002
> [ 2670.337864] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00000000000000f
> [ 2670.346306] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d28f1c80
> [ 2670.354446] RBP: ffff8800d3eeb880 R08: ffff8800d28f1d10 R09: 00000001800f000c
> [ 2670.362579] R10: ffffffff8108760c R11: ffffea00034fc200 R12: ffff88011e254640
> [ 2670.370972] R13: 0000000000000001 R14: ffff8800d28f1c70 R15: ffff8800d28f1c80
> [ 2670.379385] FS:  0000000000000000(0000) GS:ffff88011e240000(0000) knlGS:0000000000000000
> [ 2670.388826] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2670.395561] CR2: 0000000000000028 CR3: 0000000001c0c000 CR4: 00000000001407e0
> [ 2670.403681] Stack:
> [ 2670.406688]  ffff8800d3eeb898 ffffffff810a7b91 ffff8800d28f1c80 ffff8800d3eeb8f8
> [ 2670.415307]  ffffffff816f2b89 ffff8800d28f1c80 ffff8800d3eebfd8 0000000000014640
> [ 2670.424064]  0000000000014640 ffff8800d28f1c80 ffff8800d28f2318 ffff8800d3eeb5d0
> [ 2670.432560] Call Trace:
> [ 2670.436195]  [<ffffffff810a7b91>] wq_worker_sleeping+0x11/0x90
> [ 2670.443065]  [<ffffffff816f2b89>] __schedule+0x4a9/0x740
> [ 2670.449512]  [<ffffffff816f2e49>] schedule+0x29/0x70
> [ 2670.455693]  [<ffffffff8108cb47>] do_exit+0x6b7/0xa30
> [ 2670.461709]  [<ffffffff816f758c>] oops_end+0x9c/0xe0
> [ 2670.467649]  [<ffffffff81017fcb>] die+0x4b/0x70
> [ 2670.473359]  [<ffffffff816f6e20>] do_trap+0x60/0x170
> [ 2670.479445]  [<ffffffff810150aa>] do_invalid_op+0xaa/0xe0
> [ 2670.485816]  [<ffffffffa00edfe4>] ? qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> [ 2670.494264]  [<ffffffff8117e148>] ? __alloc_pages_nodemask+0x188/0xb60
> [ 2670.501765]  [<ffffffff81700c9e>] invalid_op+0x1e/0x30
> [ 2670.508219]  [<ffffffffa00edfe4>] ? qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> [ 2670.516726]  [<ffffffffa00ef933>] qlt_rdy_to_xfer+0x93/0x210 [qla2xxx]
> [ 2670.524462]  [<ffffffffa03c267c>] tcm_qla2xxx_write_pending+0x5c/0x60 [tcm_qla2xxx]
> [ 2670.533387]  [<ffffffffa04ff677>] transport_generic_new_cmd+0x127/0x1a0 [target_core_mod]
> [ 2670.542841]  [<ffffffffa04ff72b>] transport_handle_cdb_direct+0x3b/0x90 [target_core_mod]
> [ 2670.552307]  [<ffffffffa04ff8c2>] target_submit_cmd_map_sgls+0x142/0x280 [target_core_mod]
> [ 2670.561778]  [<ffffffffa04ffa59>] target_submit_cmd+0x59/0x60 [target_core_mod]
> [ 2670.570364]  [<ffffffffa03c25ea>] tcm_qla2xxx_handle_cmd+0x5a/0x90 [tcm_qla2xxx]
> [ 2670.578795]  [<ffffffffa00f0624>] qlt_do_work+0x2e4/0x460 [qla2xxx]
> [ 2670.586343]  [<ffffffff810a6886>] process_one_work+0x176/0x430
> [ 2670.593454]  [<ffffffff810a74db>] worker_thread+0x11b/0x3a0
> [ 2670.600058]  [<ffffffff810a73c0>] ? rescuer_thread+0x370/0x370
> [ 2670.606910]  [<ffffffff810ae211>] kthread+0xe1/0x100
> [ 2670.612885]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> [ 2670.620058]  [<ffffffff816ff4bc>] ret_from_fork+0x7c/0xb0
> [ 2670.626578]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> [ 2670.633697] Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e
> 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 28 04 00 00 55 48 89 e5 <48> 8b
> 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
> [ 2670.656112] RIP  [<ffffffff810ae800>] kthread_data+0x10/0x20
> [ 2670.662623]  RSP <ffff8800d3eeb880>
> [ 2670.666936] CR2: ffffffffffffffd8
> [ 2670.671170] ---[ end trace ec1db6cbc3980107 ]---
> [ 2670.676804] Fixing recursive fault but reboot is needed!
> [ 2691.628643] ------------[ cut here ]------------
> 
> 
> Looking at protocol traces on a Fibre Channel Protocol Analyzer, a Report
> LUNs command requesting 0x1008 bytes of data (FCP_DL=0x1008) appears to
> trigger the crash.
> 
> It is a rare target that has 0x1008 (5004) bytes of LUN data to return.
> What should happen is the actual data available should be returned and a
> residual under flag be set in the response along with the residual under
> field being set to the number of bytes not returned.  This system
> actually is occasionally attached to a large storage array and
> presumably that is why the large buffer is allocated for LUNs responses.
> 

AFAICT, the 'what should happen' is the current behavior in target core.

That is, the REPORT_LUNS response sets the underflow bit and associated
residual length in target_complete_cmd_with_length() when the outgoing
REPORT_LUNS payload is smaller than the received SCSI EDTL.

> Since many of the Qla2xxx operations seem to pad (with zeros) the
> responses out to whatever the requested data length was, maybe this is
> overrunning a 4k buffer somewhere?  Note, the responses should not be
> padding and returning more data than what is actually available but this
> is what we are seeing on the analyzer.  The data length on the request
> is specifying a maximum to return, not how much has to be returned.
> 

Ok, based on the comments below, I think this is a separate issue that
is specific to qla2xxx fabric, but is not the cause of the above
BUG_ON().

> So, this appears to be a combination of 2 problems.  1) Returns are
> padding with zeros to the size of the request rather than truncating to
> the size of actual data on hand. 2) Apparently a 4k buffer is being
> overrun and crashing the system when a 0x1008 (4k + 8) Request for LUNs
> command is received (this is speculation on my part).
> 

So based upon the above trace, the BUG_ON() being triggered is in the
WRITE code-path and doesn't appear related to a specific REPORT_LUNS
payload size.

Namely, there is a check in qlt_pci_map_calc_cnt() that ensures that a
non zero scatterlist count exists when a given command's scatterlists
are mapped into PCI address space for receiving the incoming WRITE
payload.

I'm thinking this could very well be some form of zero-length WRITE that
is occurring immediately after the REPORT_LUNs.

Is anything else showing up on the trace after the REPORT_LUNs..?

> Note the target system only has 1 LUN and should never come close to
> returning 4k of data.  What we actually see are 2 response PDUs with 2k
> (LEX=0x0800) of data.  This is mostly 0's, only the first 16-24 bytes
> has information.  These are followed by a response PDU with 8 bytes (of
> zeros).  The system then crashes hard.
> 
> We have traces of the initiator system connected to a Hitachi FC disk
> drive where the same Report LUNs command (w/FCP_DL=0x1008) receives a
> single response with 16 bytes of data (Len=0x0010).  The drive functions
> fine after this.
> 
> Looking at another initiator system and our LIO target, we see Requests
> for LUNs asking for approximately 2k of data and the LIO-based target
> system responds without crashing but does return approximately 2k of
> data (mostly zeros).
> 
> Please let me know if I can provide any more information to facilitate
> getting this fixed.  Access to the systems is extremely limited though.
> 

Thanks.  I need to verify zero-length WRITEs work as expected first, and
then will take a look at why REPORT_LUNS is returning the extra payload
with qla2xxx.

What would also be useful is to identify the specific WRITE CDB that is
triggering the sanity check in qlt_pci_map_calc_cnt() that is being
triggered.  Below is a debug patch to dump some information before
invoking BUG_ON().  Would it be possible to apply it to your build and
reproduce..?

Thanks again,

--nab

diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index e632e14..9a97707 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -1350,7 +1350,21 @@ static int qlt_pci_map_calc_cnt(struct qla_tgt_prm *prm)
 {
        struct qla_tgt_cmd *cmd = prm->cmd;
 
-       BUG_ON(cmd->sg_cnt == 0);
+       if (cmd->sg_cnt == 0) {
+               struct se_cmd *se_cmd = &cmd->se_cmd;
+               int i;
+
+               printk("qlt_pci_map_calc_cnt cmd->bufflen: %u\n", cmd->bufflen);
+               printk("qlt_pci_map_calc_cnt se_cmd->data_length: %u\n",
+                       se_cmd->data_length);
+               printk("qlt_pci_map_calc_cnt se_cmd->t_task_cdb: 0x%02x",
+                       se_cmd->t_task_cdb[0]);
+               for (i = 1; i < 16; i++)
+                       printk(" %02x", se_cmd->t_task_cdb[i]);
+
+               printk("\n");
+               BUG();
+       }
 
        prm->sg = (struct scatterlist *)cmd->sg;
        prm->seg_cnt = pci_map_sg(prm->tgt->ha->pdev, cmd->sg,

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html