Re: System crashes with increased drive count

Jun Wu <jwu@xxxxxxxxxxxx> · Tue, 13 May 2014 14:52:36 -0700

Hi Nicholas,

We had to roll the system back from 3.14 to 3.11 because of compile issues
with our software, so I am not able to verify your fix at this point.
I ran the same tests on 3.11 instead.

In one case the target crashed with the following message:

May 13 13:06:25 poc2 kernel: BUG: unable to handle kernel paging
request at ffffffffffffffa4
May 13 13:06:25 poc2 kernel: IP: [<ffffffff8164ac07>]
_raw_spin_lock_bh+0x17/0x40
May 13 13:06:25 poc2 kernel: PGD 1c0f067 PUD 1c11067 PMD 0
May 13 13:06:25 poc2 kernel: Oops: 0002 [#1] SMP
May 13 13:06:25 poc2 kernel: Modules linked in: fcoe libfcoe 8021q
garp mrp tcm_fc libfc scsi_transport_fc scsi_tgt target_core_pscsi
target_core_file target_core_iblock iscsi_target_mod target_core_mod
ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute
bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
iptable_security iptable_raw nfsd auth_rpcgss nfs_acl lockd sunrpc
ixgbe mdio igb ptp pps_core serio_raw ses enclosure iTCO_wdt
iTCO_vendor_support lpc_ich mfd_core shpchp i2c_i801 coretemp
kvm_intel kvm crc32c_intel microcode i7core_edac ioatdma acpi_cpufreq
edac_core dca mperf radeon i2c_algo_bit
May 13 13:06:25 poc2 kernel: drm_kms_helper ttm drm ata_generic
i2c_core pata_acpi pata_jmicron aacraid
May 13 13:06:25 poc2 kernel: CPU: 0 PID: 1810 Comm: kworker/0:0 Not
tainted 3.11.10-301.fc20.x86_64 #1
May 13 13:06:25 poc2 kernel: Hardware name: Supermicro X8DTN/X8DTN,
BIOS 2.1c       10/28/2011
May 13 13:06:25 poc2 kernel: Workqueue: target_completion
target_complete_ok_work [target_core_mod]
May 13 13:06:25 poc2 kernel: task: ffff88032c5096e0 ti:
ffff88031bb78000 task.ti: ffff88031bb78000
May 13 13:06:25 poc2 kernel: RIP: 0010:[<ffffffff8164ac07>]
[<ffffffff8164ac07>] _raw_spin_lock_bh+0x17/0x40
May 13 13:06:25 poc2 kernel: RSP: 0018:ffff88031bb79cf0  EFLAGS: 00010206
May 13 13:06:25 poc2 kernel: RAX: 0000000000000100 RBX:
ffffffffffffffa4 RCX: 0000000000000000
May 13 13:06:25 poc2 kernel: RDX: 0000000000000000 RSI:
0000000000000000 RDI: ffffffffffffffa4
May 13 13:06:25 poc2 kernel: RBP: ffff88031bb79cf8 R08:
00000000ffffffff R09: ffff88031a37f678
May 13 13:06:25 poc2 kernel: R10: 0000000000000001 R11:
0000000000000044 R12: 0000000000000000
May 13 13:06:25 poc2 kernel: R13: ffff88031a37f678 R14:
ffff88062d9fd6c8 R15: ffff88032c6da05c
May 13 13:06:25 poc2 kernel: FS:  0000000000000000(0000)
GS:ffff880333c00000(0000) knlGS:0000000000000000
May 13 13:06:25 poc2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 13 13:06:25 poc2 kernel: CR2: ffffffffffffffa4 CR3:
0000000001c0c000 CR4: 00000000000007f0
May 13 13:06:25 poc2 kernel: Stack:
May 13 13:06:25 poc2 kernel: ffffffffffffffa4 ffff88031bb79d18
ffffffffa0594d2b ffff880328d13410
May 13 13:06:25 poc2 kernel: ffff88031a37c200 ffff88031bb79d58
ffffffffa05356f2 0000000000000018
May 13 13:06:25 poc2 kernel: ffff88062cea8800 0000000000000000
ffffea000c8eb640 0000000000000000
May 13 13:06:25 poc2 kernel: Call Trace:
May 13 13:06:25 poc2 kernel: [<ffffffffa0594d2b>]
fc_seq_start_next+0x1b/0x40 [libfc]
May 13 13:06:25 poc2 kernel: [<ffffffffa05356f2>]
ft_queue_status+0xf2/0x220 [tcm_fc]
May 13 13:06:25 poc2 kernel: [<ffffffffa0536972>]
ft_queue_data_in+0x72/0x5a0 [tcm_fc]
May 13 13:06:25 poc2 kernel: [<ffffffffa04f57ba>]
target_complete_ok_work+0x14a/0x2b0 [target_core_mod]
May 13 13:06:25 poc2 kernel: [<ffffffff810810f5>] process_one_work+0x175/0x430
May 13 13:06:25 poc2 kernel: [<ffffffff81081d1b>] worker_thread+0x11b/0x3a0
May 13 13:06:25 poc2 kernel: [<ffffffff81081c00>] ? rescuer_thread+0x340/0x340
May 13 13:06:25 poc2 kernel: [<ffffffff81088660>] kthread+0xc0/0xd0
May 13 13:06:25 poc2 kernel: [<ffffffff810885a0>] ?
insert_kthread_work+0x40/0x40
May 13 13:06:25 poc2 kernel: [<ffffffff8165332c>] ret_from_fork+0x7c/0xb0
May 13 13:06:25 poc2 kernel: [<ffffffff810885a0>] ?
insert_kthread_work+0x40/0x40
May 13 13:06:25 poc2 kernel: Code: 1f 44 00 00 f3 90 0f b6 07 38 d0 75
f7 5d c3 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb e8 7e
05 a2 ff b8 00 01 00 00 <f0> 66 0f c1 03 0f b6 d4 38 c2 74 0e 0f 1f 44
00 00 f3 90 0f b6
May 13 13:06:25 poc2 kernel: RIP  [<ffffffff8164ac07>]
_raw_spin_lock_bh+0x17/0x40
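
A note on reading the target oops above: CR2, RBX and RDI all hold
ffffffffffffffa4, which is -92 (-0x5c) when read as a signed 64-bit value.
A small negative address like this would be consistent with a lock address
computed from a NULL-derived pointer passed into fc_seq_start_next(), but
that reading is an assumption, not a confirmed diagnosis. The helper below
is hypothetical (not a kernel API) and only shows the arithmetic:

```c
#include <stdint.h>

/*
 * Interpret a 64-bit address as a signed two's-complement value.
 * Hypothetical helper for illustration only; it avoids the
 * implementation-defined unsigned-to-signed conversion.
 */
static int64_t addr_as_signed(uint64_t addr)
{
    if (addr & (UINT64_C(1) << 63))
        return -(int64_t)(~addr) - 1;   /* high bit set: negative value */
    return (int64_t)addr;
}
```

Calling addr_as_signed(0xffffffffffffffa4) gives the -92 mentioned above.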

In another case, the initiator crashed with:

May 13 12:00:47 poc1 kernel: [ 4086.708455] WARNING: CPU: 1 PID: 1869
at lib/list_debug.c:62 __list_del_entry+0x82/0xd0()
May 13 12:00:47 poc1 kernel: [ 4086.708459] list_del corruption.
next->prev should be ffff88061dab0318, but was ffff88061d257318
May 13 12:00:47 poc1 kernel: [ 4086.708461] Modules linked in: fcoe
libfcoe 8021q garp mrp tcm_fc libfc scsi_transport_fc scsi_tgt
target_core_pscsi target_core_file target_core_iblock iscsi_target_mod
target_core_mod nf_conntrack_netbios_ns nf_conntrack_broadcast
ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute
bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
iptable_security iptable_raw coretemp kvm_intel kvm crc32c_intel
iTCO_wdt iTCO_vendor_support microcode serio_raw i2c_i801 ses igb
enclosure lpc_ich mfd_core ixgbe ptp pps_core mdio i7core_edac ioatdma
edac_core dca shpchp acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd
sunrpc radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic i2c_core
pata_acpi pata_jmicron aacraid
May 13 12:00:47 poc1 kernel: [ 4086.708556] CPU: 1 PID: 1869 Comm:
fcoethread/1 Not tainted 3.11.10-301.fc20.x86_64 #1
May 13 12:00:47 poc1 kernel: [ 4086.708558] Hardware name: Supermicro
X8DTN/X8DTN, BIOS 2.1c       10/28/2011
May 13 12:00:47 poc1 kernel: [ 4086.708561]  0000000000000009
ffff8806129dfb40 ffffffff816441db ffff8806129dfb88
May 13 12:00:47 poc1 kernel: [ 4086.708569]  ffff8806129dfb78
ffffffff8106715d ffff88061dab0318 ffff88061dab0a00
May 13 12:00:47 poc1 kernel: [ 4086.708576]  0000000000000286
ffff880c1b5e4388 0000000000000030 ffff8806129dfbd8
May 13 12:00:47 poc1 kernel: [ 4086.708582] Call Trace:
May 13 12:00:47 poc1 kernel: [ 4086.708592]  [<ffffffff816441db>]
dump_stack+0x45/0x56
May 13 12:00:47 poc1 kernel: [ 4086.708598]  [<ffffffff8106715d>]
warn_slowpath_common+0x7d/0xa0
May 13 12:00:47 poc1 kernel: [ 4086.708602]  [<ffffffff810671cc>]
warn_slowpath_fmt+0x4c/0x50
May 13 12:00:47 poc1 kernel: [ 4086.708608]  [<ffffffff81311dc2>]
__list_del_entry+0x82/0xd0
May 13 12:00:47 poc1 kernel: [ 4086.708613]  [<ffffffff81311e1d>]
list_del+0xd/0x30
May 13 12:00:47 poc1 kernel: [ 4086.708624]  [<ffffffffa05de23c>]
fc_io_compl+0x1cc/0x710 [libfc]
May 13 12:00:47 poc1 kernel: [ 4086.708633]  [<ffffffffa05de7df>]
fc_fcp_complete_locked+0x5f/0x1a0 [libfc]
May 13 12:00:47 poc1 kernel: [ 4086.708642]  [<ffffffffa05dfac9>]
fc_fcp_resp.isra.22+0x79/0x2f0 [libfc]
May 13 12:00:47 poc1 kernel: [ 4086.708651]  [<ffffffff810a2a33>] ?
load_balance+0xe3/0x740
May 13 12:00:47 poc1 kernel: [ 4086.708660]  [<ffffffffa05e0424>]
fc_fcp_recv+0x6e4/0xef0 [libfc]
May 13 12:00:47 poc1 kernel: [ 4086.708666]  [<ffffffff810115ce>] ?
__switch_to+0x13e/0x4b0
May 13 12:00:47 poc1 kernel: [ 4086.708673]  [<ffffffff8164aab5>] ?
_raw_spin_unlock_bh+0x15/0x20
May 13 12:00:47 poc1 kernel: [ 4086.708682]  [<ffffffffa05dfd40>] ?
fc_fcp_resp.isra.22+0x2f0/0x2f0 [libfc]
May 13 12:00:47 poc1 kernel: [ 4086.708690]  [<ffffffffa05d421b>]
fc_exch_recv+0x8eb/0xd70 [libfc]
May 13 12:00:47 poc1 kernel: [ 4086.708695]  [<ffffffffa0613299>]
fcoe_percpu_receive_thread+0x299/0x540 [fcoe]
May 13 12:00:47 poc1 kernel: [ 4086.708699]  [<ffffffffa0613000>] ?
fcoe_set_port_id+0x50/0x50 [fcoe]
May 13 12:00:47 poc1 kernel: [ 4086.708705]  [<ffffffff81088660>]
kthread+0xc0/0xd0
May 13 12:00:47 poc1 kernel: [ 4086.708710]  [<ffffffff810885a0>] ?
insert_kthread_work+0x40/0x40
May 13 12:00:47 poc1 kernel: [ 4086.708717]  [<ffffffff8165332c>]
ret_from_fork+0x7c/0xb0
May 13 12:00:47 poc1 kernel: [ 4086.708723]  [<ffffffff810885a0>] ?
insert_kthread_work+0x40/0x40
May 13 12:00:47 poc1 kernel: [ 4086.708728] ---[ end trace 61dc774d1f379191 ]---

[root@poc1 log]# lspci | grep 82599
08:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
SFI/SFP+ Network Connection (rev 01)

[root@poc1 log]# uname -a
Linux poc1 3.11.10-301.fc20.x86_64 #1 SMP Thu Dec 5 14:01:17 UTC 2013
x86_64 x86_64 x86_64 GNU/Linux

Thanks,

Jun

On Mon, May 12, 2014 at 12:34 PM, Nicholas A. Bellinger
<nab@xxxxxxxxxxxxxxx> wrote:
> On Thu, 2014-05-08 at 19:17 -0700, Jun Wu wrote:
>> We are running into system crashes as the number of drives under test
>> increases. The test configuration is one initiator server running
>> fio sessions to remote drives on a target server via fcoe vn2vn. Both
>> servers are running Fedora 20 (kernel 3.14.2-200). Running fio sessions
>> on up to 7 remote drives works, but the target machine hangs when the
>> drive count is increased to 8. The system crashes are very repeatable
>> and were duplicated on RHEL 6.5. The following are the error messages
>> on the target server:
>>
>>
>> [ 1503.737314] BUG: unable to handle kernel NULL pointer dereference
>> at 0000000000000048
>> [ 1503.737442] IP: [<ffffffffa0610885>] ft_sess_put+0x5/0x30 [tcm_fc]
>> [ 1503.737540] PGD 0
>> [ 1503.737575] Oops: 0000 [#1] SMP
>> [ 1503.737631] Modules linked in: tcm_fc target_core_pscsi
>> target_core_file target_core_iblock iscsi_target_mod target_core_mod
>> fcoe libfcoe libfc scsi_transport_fc scsi_tgt 8021q garp mrp fuse
>> ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute
>> bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
>> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
>> ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
>> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
>> iptable_security iptable_raw coretemp iTCO_wdt kvm_intel kvm gpio_ich
>> iTCO_vendor_support ses crc32c_intel tpm_tis enclosure i7core_edac
>> ioatdma edac_core shpchp serio_raw tpm lpc_ich mfd_core i2c_i801
>> microcode acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd sunrpc radeon
>> drm_kms_helper ttm
>> [ 1503.738933]  igb ixgbe drm ata_generic mdio ptp pata_acpi pps_core
>> pata_jmicron i2c_algo_bit aacraid dca i2c_core
>> [ 1503.739118] CPU: 5 PID: 6537 Comm: kworker/5:4 Not tainted
>> 3.14.2-200.fc20.x86_64 #1
>> [ 1503.739225] Hardware name: Supermicro X8DTN/X8DTN, BIOS 2.1c       10/28/2011
>> [ 1503.739338] Workqueue: target_completion target_complete_ok_work
>> [target_core_mod]
>> [ 1503.739449] task: ffff88062071d580 ti: ffff88061a322000 task.ti:
>> ffff88061a322000
>> [ 1503.739553] RIP: 0010:[<ffffffffa0610885>]  [<ffffffffa0610885>]
>> ft_sess_put+0x5/0x30 [tcm_fc]
>> [ 1503.739681] RSP: 0018:ffff88061a323ce8  EFLAGS: 00010016
>> [ 1503.739755] RAX: 0000000000000000 RBX: ffff880304a23498 RCX: 0000000000009010
>> [ 1503.739853] RDX: 0000000000009010 RSI: 00000000000000cb RDI: 0000000000000000
>> [ 1503.739953] RBP: ffff88061a323d08 R08: ffff88031c4c6500 R09: 000000018020000f
>> [ 1503.740051] R10: ffffffff815cfe87 R11: ffffea000c713180 R12: ffff88031c4c6500
>> [ 1503.740150] R13: ffff88031f7c1f80 R14: ffff88031f7c1fe8 R15: 0000000000000000
>> [ 1503.740250] FS:  0000000000000000(0000) GS:ffff88063fc20000(0000)
>> knlGS:0000000000000000
>> [ 1503.740363] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 1503.740443] CR2: 0000000000000048 CR3: 0000000001c0c000 CR4: 00000000000007e0
>> [ 1503.740541] Stack:
>> [ 1503.740572]  ffffffffa060e058 ffff880304a23568 ffff88031f7c1f80
>> ffff880304a234a8
>> [ 1503.740692]  ffff88061a323d18 ffffffffa060e5e2 ffff88061a323d40
>> ffffffffa05cff42
>> [ 1503.740812]  ffff880304a234a8 ffff880304a23568 0000000000000246
>> ffff88061a323d70
>> [ 1503.740931] Call Trace:
>> [ 1503.740969]  [<ffffffffa060e058>] ? ft_free_cmd+0x58/0x60 [tcm_fc]
>> [ 1503.741057]  [<ffffffffa060e5e2>] ft_release_cmd+0x12/0x20 [tcm_fc]
>> [ 1503.741150]  [<ffffffffa05cff42>] target_release_cmd_kref+0x52/0x80
>> [target_core_mod]
>> [ 1503.741264]  [<ffffffffa05d1bd3>] transport_release_cmd+0xd3/0xf0
>> [target_core_mod]
>> [ 1503.741377]  [<ffffffffa05d1c28>]
>> transport_generic_free_cmd+0x38/0x250 [target_core_mod]
>> [ 1503.741491]  [<ffffffffa060e600>] ft_check_stop_free+0x10/0x20 [tcm_fc]
>> [ 1503.741590]  [<ffffffffa05cfe32>]
>> transport_cmd_check_stop+0xc2/0x140 [target_core_mod]
>> [ 1503.741708]  [<ffffffffa05d3a97>]
>> target_complete_ok_work+0xe7/0x2d0 [target_core_mod]
>> [ 1503.741824]  [<ffffffff810a6886>] process_one_work+0x176/0x430
>> [ 1503.741907]  [<ffffffff810a74db>] worker_thread+0x11b/0x3a0
>> [ 1503.741985]  [<ffffffff810a73c0>] ? rescuer_thread+0x370/0x370
>> [ 1503.742069]  [<ffffffff810ae211>] kthread+0xe1/0x100
>> [ 1503.742138]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
>> [ 1503.742227]  [<ffffffff816fef7c>] ret_from_fork+0x7c/0xb0
>> [ 1503.746668]  [<ffffffff810ae130>] ? insert_kthread_work+0x40/0x40
>> [ 1503.751104] Code: 48 89 f0 48 89 c7 89 d6 48 89 e5 48 8b 49 10 48
>> 89 ca e8 4f ed ff ff 5d c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66
>> 66 66 66 90 <8b> 47 48 85 c0 74 22 48 8d 47 48 f0 83 6f 48 01 74 09 c3
>> 0f 1f
>> [ 1503.760558] RIP  [<ffffffffa0610885>] ft_sess_put+0x5/0x30 [tcm_fc]
>> [ 1503.765145]  RSP <ffff88061a323ce8>
>> [ 1503.769687] CR2: 0000000000000048
>> [ 1503.789003] ---[ end trace c7457ccb45bf0bc9 ]---
>>
>>
>>
>
> The v3.14 OOPs above looks like a free-after-use regression from the
> v3.13 conversion to use percpu-ida for pre-allocation of ft_cmd
> descriptors.
>
> Here's the patch that I'm applying to address this specific bug in
> tcm_fc.  Please apply it and verify the fix on your end.
>
> From 1d8dc8a29cfa6d66e5068ab6dad3216fe218cc53 Mon Sep 17 00:00:00 2001
> From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> Date: Mon, 12 May 2014 12:18:32 -0700
> Subject: [PATCH] tcm_fc: Fix free-after-use regression in ft_free_cmd
>
> This patch fixes a free-after-use regression in ft_free_cmd(), where
> percpu_ida_free() was incorrectly called to release the tag before
> ft_sess_put() is called to drop the session reference.
>
> Fix this bug by moving the percpu_ida_free() call after ft_sess_put().
>
> The regression was originally introduced in v3.13-rc1 commit:
>
>   commit 5f544cfac956971099e906f94568bc3fd1a7108a
>   Author: Nicholas Bellinger <nab@xxxxxxxxxxxxx>
>   Date:   Mon Sep 23 12:12:42 2013 -0700
>
>       tcm_fc: Convert to per-cpu command map pre-allocation of ft_cmd
>
> Reported-by: Jun Wu <jwu@xxxxxxxxxxxx>
> Cc: Mark Rustad <mark.d.rustad@xxxxxxxxx>
> Cc: Robert Love <robert.w.love@xxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> #3.13+
> Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> ---
>  drivers/target/tcm_fc/tfc_cmd.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/target/tcm_fc/tfc_cmd.c b/drivers/target/tcm_fc/tfc_cmd.c
> index 01cf37f..28fce39 100644
> --- a/drivers/target/tcm_fc/tfc_cmd.c
> +++ b/drivers/target/tcm_fc/tfc_cmd.c
> @@ -100,8 +100,8 @@ static void ft_free_cmd(struct ft_cmd *cmd)
>         if (fr_seq(fp))
>                 lport->tt.seq_release(fr_seq(fp));
>         fc_frame_free(fp);
> -       percpu_ida_free(&se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
>         ft_sess_put(cmd->sess); /* undo get from lookup at recv */
> +       percpu_ida_free(&se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
>  }
>
>  void ft_release_cmd(struct se_cmd *se_cmd)
> --
> 1.7.10.4
>
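
The ordering problem the patch above describes can be sketched in plain C.
This is a simplified user-space model, not the tcm_fc code: it assumes the
command descriptor lives in a tag-indexed pool, so that once the tag is
released the slot may be reused and its fields can no longer be trusted.
All names here (struct session, struct cmd, tag_alloc, tag_free,
session_put, cmd_start, cmd_release) are hypothetical.

```c
#include <stddef.h>

struct session { int refcount; };
struct cmd { struct session *sess; int tag; };

#define POOL_SIZE 4
static struct cmd pool[POOL_SIZE];   /* descriptors indexed by tag */
static int in_use[POOL_SIZE];

static int tag_alloc(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return i;
        }
    }
    return -1;                       /* pool exhausted */
}

static void tag_free(int tag)
{
    in_use[tag] = 0;
    /* Model reuse by another owner: the slot contents are now invalid. */
    pool[tag].sess = NULL;
}

static void session_put(struct session *s)
{
    s->refcount--;
}

/* Take a tag and a session reference for a new command. */
static struct cmd *cmd_start(struct session *s)
{
    int tag = tag_alloc();

    if (tag < 0)
        return NULL;
    pool[tag].sess = s;
    pool[tag].tag = tag;
    s->refcount++;
    return &pool[tag];
}

/* Drop the session reference first, release the tag last. */
static void cmd_release(struct cmd *c)
{
    session_put(c->sess);            /* c is still owned here */
    tag_free(c->tag);                /* after this, c must not be touched */
}
```

Swapping the two calls in cmd_release() would make session_put() read
c->sess from a slot that has already been handed back to the pool, which
in this model is NULL and in a real system could belong to another command.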
>> Before the target hangs, many messages like the following are printed
>> on the initiator:
>>
>> fio: io_u error on file /dev/sdl: Input/output error
>>      read offset=1030152192, buflen=4096
>>
>> [ 3787.971900] sd 8:0:0:0: [sdl] Unhandled error code
>> [ 3787.971907] sd 8:0:0:0: [sdl]
>> [ 3787.971910] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
>> [ 3787.971913] sd 8:0:0:0: [sdl] CDB:
>> [ 3787.971915] Read(10): 28 00 00 1e b3 70 00 00 08 00
>> [ 3787.971924] end_request: I/O error, dev sdl, sector 2012016
>>
>>
>
> Not sure what's going on here without more information.
>
>> Installation steps used:
>> yum install lldpad
>> yum install fcoe-utils
>> modprobe fcoe
>> yum install targetcli
>>
>> Before these tests, we also installed Red Hat 6.5 and followed the
>> instructions on https://www.open-fcoe.org/. On Red Hat, I was only
>> able to run fio against 3 target drives. Using 4 target drives crashed
>> the target machine.
>>
>
> No idea without more info wrt RHEL 6.5, but it certainly doesn't have
> the v3.13+ specific percpu-ida regression from above.
>
> --nab
>