Re: qla2xxx crashing kernel

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Fri, 03 Oct 2014 14:04:37 -0700

Hi Craig,

A quick update on this bug.  Comments below.

On Wed, 2014-09-17 at 17:34 -0700, Nicholas A. Bellinger wrote:
> Hi Craig,
> 
> Thanks for the detailed bug report.  Comments are inline below.
> 
> Also, adding some QLogic folks on the CC'..
> 
> On Fri, 2014-09-05 at 19:53 -0400, Craig Watson wrote:
> > Hi Nicholas,
> > 
> > One of our systems is seeing a kernel BUG with a fibre channel target
> > system.  The system is an Intel i7 with a qle2562 FC HBA running FC20
> > with kernel 3.14.4-200.  (We have seen this with several older kernel
> > versions too).
> > 
> > The LIO component versions are:
> >      targetcli    2.1.fb35-1
> >      rtslib       2.1.fb47-1
> >      configshell  1.1.fb13-1
> > 
> > The bug/crash is repeatable.  It happens every time the LIO target
> > system is attached to the initiator.  The target is connected to an
> > older VxWorks initiator system.  This system has been in service for
> > years and works with several models of FC disk drives.  The new
> > LIO-based system is an attempt to replace aging hardware with newer,
> > more reliable, and higher performance storage.
> > 
> 
> <nod>, thanks for the extra background on your setup.
> 
> > Here is the console output of the kernel BUG:
> > 
> > [ 2669.734186] ------------[ cut here ]------------
> > [ 2669.738893] kernel BUG at drivers/scsi/qla2xxx/qla_target.c:1343!
> > [ 2669.745080] invalid opcode: 0000 [#1] SMP
> > [ 2669.749275] Modules linked in: tcm_qla2xxx target_core_pscsi
> > target_core_file target_core_iblock iscsi_target_mod target_core_mod
> > cfg80211 rfkill snd_hda_codec_realtek snd_hda_codec_generic
> > snd_hda_intel snd_hda_codec serio_raw snd_hwdep snd_pcm snd_timer
> > x86_pkg_temp_thermal snd e1000e ptp iTCO_wdt mei_me iTCO_vendor_support
> > mei lpc_ich mfd_core raid0 i2c_i801 shpchp pps_core soundcore coretemp
> > kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel
> > ghash_clmulni_intel microcode usb_storage ata_generic pata_acpi i915
> > pata_jmicron qla2xxx i2c_algo_bit drm_kms_helper drm scsi_transport_fc
> > scsi_tgt i2c_core video
> > [ 2669.805324] CPU: 1 PID: 558 Comm: kworker/1:0 Not tainted 3.14.4-200.fc20.x86_64 #1
> > [ 2669.813052] Hardware name: Intel Corporation ChiefRiver Platform/PCOM-B219VG, BIOS R1.00.W3 09/24/2012
> > [ 2669.822485] Workqueue: qla_tgt_wq qlt_do_work [qla2xxx]
> > [ 2669.827810] task: ffff8800d28f1c80 ti: ffff8800d3eea000 task.ti: ffff8800d3eea000
> > [ 2669.835399] RIP: 0010:[<ffffffffa00edfe4>] [<ffffffffa00edfe4>] qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> > [ 2669.845308] RSP: 0018:ffff8800d3eebba8  EFLAGS: 00010246
> > [ 2669.850712] RAX: ffff880117d3b400 RBX: ffff880036b9d6f0 RCX: ffffffffa010a19a
> > [ 2669.857957] RDX: 0000000000000000 RSI: ffff880036b9d6f0 RDI: ffff8800d3eebc00
> > [ 2669.865190] RBP: ffff8800d3eebbe0 R08: 0000000000000000 R09: 0000000000000020
> > [ 2669.872418] R10: 0000000000000002 R11: 0000000000000000 R12: ffff8800d3eebc00
> > [ 2669.879646] R13: ffff8800d3eebc00 R14: ffff8800d1e30000 R15: ffff880117d3
> > [ 2669.886897] FS:  0000000000000000(0000) GS:ffff88011e240000(0000) knlGS:0000000000000000
> > [ 2669.895102] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 2669.900937] CR2: 00007f5efb75b000 CR3: 0000000001c0c000 CR4: 00000000001407e0
> > [ 2669.908144] Stack:
> > [ 2669.910187]  000000006acfe608 0000000100000000 ffff880036b9d6f0 ffff880117d3b400
> > [ 2669.917797]  ffff8800d3eebc00 ffff8800d1e30000 ffff880117d3b72c ffff8800d3eebc78
> > [ 2669.925387]  ffffffffa00ef933 0000000000000000 0000000000000000 ffff880117d3b400
> > [ 2669.932969] Call Trace:
> > [ 2669.935455]  [<ffffffffa00ef933>] qlt_rdy_to_xfer+0x93/0x210 [qla2xxx]
> > [ 2669.942070]  [<ffffffffa03c267c>] tcm_qla2xxx_write_pending+0x5c/0x60 [tcm_qla2xxx]
> > [ 2669.949855]  [<ffffffffa04ff677>] transport_generic_new_cmd+0x127/0x1a0 [target_core_mod]
> > [ 2669.958184]  [<ffffffffa04ff72b>] transport_handle_cdb_direct+0x3b/0x90 [target_core_mod]
> > [ 2669.966510]  [<ffffffffa04ff8c2>] target_submit_cmd_map_sgls+0x142/0x280 [target_core_mod]
> > [ 2669.974917]  [<ffffffffa04ffa59>] target_submit_cmd+0x59/0x60 [target_core_mod]
> > [ 2669.982336]  [<ffffffffa03c25ea>] tcm_qla2xxx_handle_cmd+0x5a/0x90 [tcm_qla2xxx]
> > [ 2669.989847]  [<ffffffffa00f0624>] qlt_do_work+0x2e4/0x460 [qla2xxx]
> > [ 2669.996231]  [<ffffffff810a6886>] process_one_work+0x176/0x430
> > [ 2670.002163]  [<ffffffff810a74db>] worker_thread+0x11b/0x3a0
> > [ 2670.007809]  [<ffffffff810a73c0>] ? rescuer_thread+0x370/0x370
> > [ 2670.013710]  [<ffffffff810ae211>] kthread+0xe1/0x100
> > [ 2670.018763]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> > [ 2670.024943]  [<ffffffff816ff4bc>] ret_from_fork+0x7c/0xb0
> > [ 2670.031673]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> > [ 2670.039056] Code: 0f 1f 80 00 00 00 00 45 8b 4c 24 24 eb be 66 0f 1f 84 00
> >                       00 00 00 00 48 c7 45 c8 00 00 00 00 4c 8b 3d 01 a6 b2 e1
> >                       e9 18 ff ff ff <0f> 0b 48 8b b0 f8 02 00 00 44 8b 88 d0
> >                       02 00 00 45 31 c0 31 c0
> > [ 2670.062304] RIP  [<ffffffffa00edfe4>] qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> > [ 2670.071041]  RSP <ffff8800d3eebba8>
> > [ 2670.201729] ---[ end trace ec1db6cbc3980106 ]---
> > [ 2670.207583] BUG: unable to handle kernel paging request at ffffffffffffffd8
> > [ 2670.215756] IP: [<ffffffff810ae800>] kthread_data+0x10/0x20
> > [ 2670.222563] PGD 1c0f067 PUD 1c11067 PMD 0
> > [ 2670.227968] Oops: 0000 [#2] SMP
> > [ 2670.232458] Modules linked in: tcm_qla2xxx target_core_pscsi
> > target_core_file target_core_iblock iscsi_target_mod target_core_mod 
> > cfg80211
> > rfkill snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel 
> > snd_hda_codec
> > serio_raw snd_hwdep snd_pcm snd_timer x86_pkg_temp_thermal snd e1000e ptp
> > iTCO_wdt mei_me iTCO_vendor_support mei lpc_ich mfd_core raid0 i2c_i801 
> > shpchp
> > pps_core soundcore coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul
> > crc32c_intel ghash_clmulni_intel microcode usb_storage ata_generic pata_acpi
> > i915 pata_jmicron qla2xxx i2c_algo_bit drm_kms_helper drm scsi_transport_fc
> > scsi_tgt i2c_core video
> > [ 2670.293550] CPU: 1 PID: 558 Comm: kworker/1:0 Tainted: G D 3.14.4-200.fc20.x86_64 #1
> > [ 2670.303354] Hardware name: Intel Corporation ChiefRiver Platform/PCOM-B219VG,BIOS R1.00.W3 09/24/2012
> > [ 2670.313751] task: ffff8800d28f1c80 ti: ffff8800d3eea000 task.ti: ffff8800d3eea000
> > [ 2670.322257] RIP: 0010:[<ffffffff810ae800>] [<ffffffff810ae800>] kthread_data+0x10/0x20
> > [ 2670.331302] RSP: 0018:ffff8800d3eeb880  EFLAGS: 00010002
> > [ 2670.337864] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00000000000000f
> > [ 2670.346306] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d28f1c80
> > [ 2670.354446] RBP: ffff8800d3eeb880 R08: ffff8800d28f1d10 R09: 00000001800f000c
> > [ 2670.362579] R10: ffffffff8108760c R11: ffffea00034fc200 R12: ffff88011e254640
> > [ 2670.370972] R13: 0000000000000001 R14: ffff8800d28f1c70 R15: ffff8800d28f1c80
> > [ 2670.379385] FS:  0000000000000000(0000) GS:ffff88011e240000(0000) knlGS:0000000000000000
> > [ 2670.388826] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 2670.395561] CR2: 0000000000000028 CR3: 0000000001c0c000 CR4: 00000000001407e0
> > [ 2670.403681] Stack:
> > [ 2670.406688]  ffff8800d3eeb898 ffffffff810a7b91 ffff8800d28f1c80 ffff8800d3eeb8f8
> > [ 2670.415307]  ffffffff816f2b89 ffff8800d28f1c80 ffff8800d3eebfd8 0000000000014640
> > [ 2670.424064]  0000000000014640 ffff8800d28f1c80 ffff8800d28f2318 ffff8800d3eeb5d0
> > [ 2670.432560] Call Trace:
> > [ 2670.436195]  [<ffffffff810a7b91>] wq_worker_sleeping+0x11/0x90
> > [ 2670.443065]  [<ffffffff816f2b89>] __schedule+0x4a9/0x740
> > [ 2670.449512]  [<ffffffff816f2e49>] schedule+0x29/0x70
> > [ 2670.455693]  [<ffffffff8108cb47>] do_exit+0x6b7/0xa30
> > [ 2670.461709]  [<ffffffff816f758c>] oops_end+0x9c/0xe0
> > [ 2670.467649]  [<ffffffff81017fcb>] die+0x4b/0x70
> > [ 2670.473359]  [<ffffffff816f6e20>] do_trap+0x60/0x170
> > [ 2670.479445]  [<ffffffff810150aa>] do_invalid_op+0xaa/0xe0
> > [ 2670.485816]  [<ffffffffa00edfe4>] ? qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> > [ 2670.494264]  [<ffffffff8117e148>] ? __alloc_pages_nodemask+0x188/0xb60
> > [ 2670.501765]  [<ffffffff81700c9e>] invalid_op+0x1e/0x30
> > [ 2670.508219]  [<ffffffffa00edfe4>] ? qlt_pci_map_calc_cnt+0x164/0x1a0 [qla2xxx]
> > [ 2670.516726]  [<ffffffffa00ef933>] qlt_rdy_to_xfer+0x93/0x210 [qla2xxx]
> > [ 2670.524462]  [<ffffffffa03c267c>] tcm_qla2xxx_write_pending+0x5c/0x60 [tcm_qla2xxx]
> > [ 2670.533387]  [<ffffffffa04ff677>] transport_generic_new_cmd+0x127/0x1a0 [target_core_mod]
> > [ 2670.542841]  [<ffffffffa04ff72b>] transport_handle_cdb_direct+0x3b/0x90 [target_core_mod]
> > [ 2670.552307]  [<ffffffffa04ff8c2>] target_submit_cmd_map_sgls+0x142/0x280 [target_core_mod]
> > [ 2670.561778]  [<ffffffffa04ffa59>] target_submit_cmd+0x59/0x60 [target_core_mod]
> > [ 2670.570364]  [<ffffffffa03c25ea>] tcm_qla2xxx_handle_cmd+0x5a/0x90 [tcm_qla2xxx]
> > [ 2670.578795]  [<ffffffffa00f0624>] qlt_do_work+0x2e4/0x460 [qla2xxx]
> > [ 2670.586343]  [<ffffffff810a6886>] process_one_work+0x176/0x430
> > [ 2670.593454]  [<ffffffff810a74db>] worker_thread+0x11b/0x3a0
> > [ 2670.600058]  [<ffffffff810a73c0>] ? rescuer_thread+0x370/0x370
> > [ 2670.606910]  [<ffffffff810ae211>] kthread+0xe1/0x100
> > [ 2670.612885]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> > [ 2670.620058]  [<ffffffff816ff4bc>] ret_from_fork+0x7c/0xb0
> > [ 2670.626578]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
> > [ 2670.633697] Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e
> > 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 28 04 00 00 55 48 89 e5 <48> 8b
> > 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
> > [ 2670.656112] RIP  [<ffffffff810ae800>] kthread_data+0x10/0x20
> > [ 2670.662623]  RSP <ffff8800d3eeb880>
> > [ 2670.666936] CR2: ffffffffffffffd8
> > [ 2670.671170] ---[ end trace ec1db6cbc3980107 ]---
> > [ 2670.676804] Fixing recursive fault but reboot is needed!
> > [ 2691.628643] ------------[ cut here ]------------
> > 
> > 
> > Looking at protocol traces on a Fibre Channel Protocol Analyzer, a Report
> > LUNs command requesting 0x1008 bytes of data (FCP_DL=0x1008) appears to
> > trigger the crash.
> > 
> > It is a rare target that has 0x1008 (5004) bytes of LUN data to return.
> > What should happen is the actual data available should be returned and a
> > residual under flag be set in the response along with the residual under
> > field being set to the number of bytes not returned.  This system
> > actually is occasionally attached to a large storage array and
> > presumably that is why the large buffer is allocated for LUNs responses.
> > 
> 
> AFAICT, the 'what should happen' is the current behavior in target core.
> 
> That is, the REPORT_LUNS response sets the underflow bit and associated
> residual length in target_complete_cmd_with_length() when the outgoing
> REPORT_LUNS payload is smaller than the received SCSI EDTL.
> 
> > Since many of the Qla2xxx operations seem to pad (with zeros) the
> > responses out to whatever the requested data length was, maybe this is
> > overrunning a 4k buffer somewhere?  Note, the responses should not be
> > padding and returning more data than what is actually available but this
> > is what we are seeing on the analyzer.  The data length on the request
> > is specifying a maximum to return, not how much has to be returned.
> > 
> 
> Ok, based on the comments below, I think this is a separate issue that
> is specific to qla2xxx fabric, but is not the cause of the above
> BUG_ON().
> 
> > So, this appears to be a combination of 2 problems.  1) Returns are
> > padding with zeros to the size of the request rather than truncating to
> > the size of actual data on hand. 2) Apparently a 4k buffer is being
> > overrun and crashing the system when a 0x1008 (4k + 8) Request for LUNs
> > command is received (this is speculation on my part).
> > 
> 
> So based upon the above trace, the BUG_ON() being triggered is in the
> WRITE code-path and doesn't appear related to a specific REPORT_LUNS
> payload size.
> 
> Namely, there is a check in qlt_pci_map_calc_cnt() that ensures that a
> non zero scatterlist count exists when a given command's scatterlists
> are mapped into PCI address space for receiving the incoming WRITE
> payload.
> 
> I'm thinking this could very well be some form of zero-length WRITE that
> is occurring immediately after the REPORT_LUNs.
> 
> Is anything else showing up on the trace after the REPORT_LUNs..?
> 
> > Note the target system only has 1 LUN and should never come close to
> > returning 4k of data.  What we actually see are 2 response PDUs with 2k
> > (LEX=0x0800) of data.  This is mostly 0's, only the first 16-24 bytes
> > has information.  These are followed by a response PDU with 8 bytes (of
> > zeros).  The system then crashes hard.
> > 
> > We have traces of the initiator system connected to a Hitachi FC disk
> > drive where the same Report LUNs command (w/FCP_DL=0x1008) receives a
> > single response with 16 bytes of data (Len=0x0010).  The drive functions
> > fine after this.
> > 
> > Looking at another initiator system and our LIO target, we see Requests
> > for LUNs asking for approximately 2k of data and the LIO-based target
> > system responds without crashing but does return approximately 2k of
> > data (mostly zeros).
> > 
> > Please let me know if I can provide any more information to facilitate
> > getting this fixed.  Access to the systems is extremely limited though.
> > 
> 
> Thanks.  I need to verify zero-length WRITEs work as expected first, and
> then will take a look at why REPORT_LUNS is returning the extra payload
> with qla2xxx.
> 

OK, I've been able to verify with v3.14.13 + qla2xxx export that
zero-length WRITEs and residual handling for REPORT_LUNS with a 5004
byte transfer length are working as expected.

# sg_raw /dev/sg9 2a 00 00 00 00 00 00 00 00 00
SCSI Status: Good 

# sg_luns -m 5004 -v /dev/sdj
    report luns cdb: a0 00 00 00 00 00 00 00 13 8c 00 00 
    report luns: pass-through requested 5004 bytes but got 72 bytes
Lun list length = 8 which imples 1 lun entry
Report luns [select_report=0x0]:
    0000000000000000

Looking at the v3.14.y change-log, Roland's patch below to fix the
residual handling went in for v3.14.9, and should address the zero
padding issue you observed with the FC analyzer.

target: Report correct response length for some commands
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-3.14.y&id=9a7f545e75cef29dbdc4c585f3ae27c5f7f4ea48

It's still unclear if this is what triggered the BUG_ON() in
qlt_pci_map_calc_cnt() you originally reported, but at this point it
would be a good idea to retest with >= v3.14.13, preferably with extra
debug code in place to dump se_cmd if the same BUG_ON() triggers again.

Thanks,

--nab

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html