On Wed, May 14, 2014 at 2:39 PM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> On Tue, 2014-05-13 at 14:52 -0700, Jun Wu wrote:
>> Hi Nicholas,
>>
>> We had to roll back the system from 3.14 to 3.11 due to compile issues with
>> our software. So I am not able to verify your fix at this point.
>
> It is unfortunate that you're stuck on a now unsupported stable kernel.
> There are some other libfc related fixes that have gone in during the
> v3.13 timeframe, so I'd strongly recommend upgrading to at least that
> stable version.
>
> In any event, I'll be pushing that particular >= v3.13.y patch anyways,
> as it's an obvious regression bugfix for percpu-ida pre-allocation.
>
>> I ran the same tests on 3.11 instead.
>>
>> In one case the target crashed with the following message:
>>
>> May 13 13:06:25 poc2 kernel: BUG: unable to handle kernel paging
>> request at ffffffffffffffa4
>> May 13 13:06:25 poc2 kernel: IP: [<ffffffff8164ac07>]
>> _raw_spin_lock_bh+0x17/0x40
>> May 13 13:06:25 poc2 kernel: PGD 1c0f067 PUD 1c11067 PMD 0
>> May 13 13:06:25 poc2 kernel: Oops: 0002 [#1] SMP
>> May 13 13:06:25 poc2 kernel: Modules linked in: fcoe libfcoe 8021q
>> garp mrp tcm_fc libfc scsi_transport_fc scsi_tgt target_core_pscsi
>> target_core_file target_core_iblock iscsi_target_mod target_core_mod
>> ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute
>> bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
>> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
>> ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
>> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
>> iptable_security iptable_raw nfsd auth_rpcgss nfs_acl lockd sunrpc
>> ixgbe mdio igb ptp pps_core serio_raw ses enclosure iTCO_wdt
>> iTCO_vendor_support lpc_ich mfd_core shpchp i2c_i801 coretemp
>> kvm_intel kvm crc32c_intel microcode i7core_edac ioatdma acpi_cpufreq
>> edac_core dca mperf radeon i2c_algo_bit
>> May 13 13:06:25 poc2 kernel: drm_kms_helper
>> ttm drm ata_generic
>> i2c_core pata_acpi pata_jmicron aacraid
>> May 13 13:06:25 poc2 kernel: CPU: 0 PID: 1810 Comm: kworker/0:0 Not
>> tainted 3.11.10-301.fc20.x86_64 #1
>> May 13 13:06:25 poc2 kernel: Hardware name: Supermicro X8DTN/X8DTN,
>> BIOS 2.1c 10/28/2011
>> May 13 13:06:25 poc2 kernel: Workqueue: target_completion target_complete_ok_work [target_core_mod]
>> May 13 13:06:25 poc2 kernel: task: ffff88032c5096e0 ti: ffff88031bb78000 task.ti: ffff88031bb78000
>> May 13 13:06:25 poc2 kernel: RIP: 0010:[<ffffffff8164ac07>] [<ffffffff8164ac07>] _raw_spin_lock_bh+0x17/0x40
>> May 13 13:06:25 poc2 kernel: RSP: 0018:ffff88031bb79cf0 EFLAGS: 00010206
>> May 13 13:06:25 poc2 kernel: RAX: 0000000000000100 RBX: ffffffffffffffa4 RCX: 0000000000000000
>> May 13 13:06:25 poc2 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffa4
>> May 13 13:06:25 poc2 kernel: RBP: ffff88031bb79cf8 R08: 00000000ffffffff R09: ffff88031a37f678
>> May 13 13:06:25 poc2 kernel: R10: 0000000000000001 R11: 0000000000000044 R12: 0000000000000000
>> May 13 13:06:25 poc2 kernel: R13: ffff88031a37f678 R14: ffff88062d9fd6c8 R15: ffff88032c6da05c
>> May 13 13:06:25 poc2 kernel: FS: 0000000000000000(0000) GS:ffff880333c00000(0000) knlGS:0000000000000000
>> May 13 13:06:25 poc2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> May 13 13:06:25 poc2 kernel: CR2: ffffffffffffffa4 CR3: 0000000001c0c000 CR4: 00000000000007f0
>> May 13 13:06:25 poc2 kernel: Stack:
>> May 13 13:06:25 poc2 kernel: ffffffffffffffa4 ffff88031bb79d18 ffffffffa0594d2b ffff880328d13410
>> May 13 13:06:25 poc2 kernel: ffff88031a37c200 ffff88031bb79d58 ffffffffa05356f2 0000000000000018
>> May 13 13:06:25 poc2 kernel: ffff88062cea8800 0000000000000000 ffffea000c8eb640 0000000000000000
>> May 13 13:06:25 poc2 kernel: Call Trace:
>> May 13 13:06:25 poc2 kernel: [<ffffffffa0594d2b>] fc_seq_start_next+0x1b/0x40 [libfc]
>> May 13 13:06:25 poc2 kernel: [<ffffffffa05356f2>] ft_queue_status+0xf2/0x220 [tcm_fc]
>> May 13 13:06:25 poc2 kernel: [<ffffffffa0536972>] ft_queue_data_in+0x72/0x5a0 [tcm_fc]
>> May 13 13:06:25 poc2 kernel: [<ffffffffa04f57ba>] target_complete_ok_work+0x14a/0x2b0 [target_core_mod]
>> May 13 13:06:25 poc2 kernel: [<ffffffff810810f5>] process_one_work+0x175/0x430
>> May 13 13:06:25 poc2 kernel: [<ffffffff81081d1b>] worker_thread+0x11b/0x3a0
>> May 13 13:06:25 poc2 kernel: [<ffffffff81081c00>] ? rescuer_thread+0x340/0x340
>> May 13 13:06:25 poc2 kernel: [<ffffffff81088660>] kthread+0xc0/0xd0
>> May 13 13:06:25 poc2 kernel: [<ffffffff810885a0>] ? insert_kthread_work+0x40/0x40
>> May 13 13:06:25 poc2 kernel: [<ffffffff8165332c>] ret_from_fork+0x7c/0xb0
>> May 13 13:06:25 poc2 kernel: [<ffffffff810885a0>] ? insert_kthread_work+0x40/0x40
>> May 13 13:06:25 poc2 kernel: Code: 1f 44 00 00 f3 90 0f b6 07 38 d0 75
>> f7 5d c3 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb e8 7e
>> 05 a2 ff b8 00 01 00 00 <f0> 66 0f c1 03 0f b6 d4 38 c2 74 0e 0f 1f 44
>> 00 00 f3 90 0f b6
>> May 13 13:06:25 poc2 kernel: RIP [<ffffffff8164ac07>] _raw_spin_lock_bh+0x17/0x40
>>
>
> So before we start debugging again, please confirm that this is a
> *completely* stock v3.11.10 build, and that you're not building
> out-of-tree target modules again.

Yes. We installed the official Fedora 20 ISO image and then ran:

    yum install targetcli
    modprobe fcoe

The kernel version is 3.11.10-301.fc20.x86_64, which is the stock
distribution kernel.

>
>>
>> In another case, the initiator crashed with:
>>
>> May 13 12:00:47 poc1 kernel: [ 4086.708455] WARNING: CPU: 1 PID: 1869
>> at lib/list_debug.c:62 __list_del_entry+0x82/0xd0()
>> May 13 12:00:47 poc1 kernel: [ 4086.708459] list_del corruption.
>> next->prev should be ffff88061dab0318, but was ffff88061d257318
>> May 13 12:00:47 poc1 kernel: [ 4086.708461] Modules linked in: fcoe
>> libfcoe 8021q garp mrp tcm_fc libfc scsi_transport_fc scsi_tgt
>> target_core_pscsi target_core_file target_core_iblock iscsi_target_mod
>> target_core_mod nf_conntrack_netbios_ns nf_conntrack_broadcast
>> ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute
>> bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
>> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
>> ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
>> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
>> iptable_security iptable_raw coretemp kvm_intel kvm crc32c_intel
>> iTCO_wdt iTCO_vendor_support microcode serio_raw i2c_i801 ses igb
>> enclosure lpc_ich mfd_core ixgbe ptp pps_core mdio i7core_edac ioatdma
>> edac_core dca shpchp acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd
>> sunrpc radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic i2c_core
>> pata_acpi pata_jmicron aacraid
>> May 13 12:00:47 poc1 kernel: [ 4086.708556] CPU: 1 PID: 1869 Comm:
>> fcoethread/1 Not tainted 3.11.10-301.fc20.x86_64 #1
>> May 13 12:00:47 poc1 kernel: [ 4086.708558] Hardware name: Supermicro
>> X8DTN/X8DTN, BIOS 2.1c 10/28/2011
>> May 13 12:00:47 poc1 kernel: [ 4086.708561] 0000000000000009 ffff8806129dfb40 ffffffff816441db ffff8806129dfb88
>> May 13 12:00:47 poc1 kernel: [ 4086.708569] ffff8806129dfb78 ffffffff8106715d ffff88061dab0318 ffff88061dab0a00
>> May 13 12:00:47 poc1 kernel: [ 4086.708576] 0000000000000286 ffff880c1b5e4388 0000000000000030 ffff8806129dfbd8
>> May 13 12:00:47 poc1 kernel: [ 4086.708582] Call Trace:
>> May 13 12:00:47 poc1 kernel: [ 4086.708592] [<ffffffff816441db>] dump_stack+0x45/0x56
>> May 13 12:00:47 poc1 kernel: [ 4086.708598] [<ffffffff8106715d>] warn_slowpath_common+0x7d/0xa0
>> May 13 12:00:47 poc1 kernel: [ 4086.708602] [<ffffffff810671cc>]
>> warn_slowpath_fmt+0x4c/0x50
>> May 13 12:00:47 poc1 kernel: [ 4086.708608] [<ffffffff81311dc2>] __list_del_entry+0x82/0xd0
>> May 13 12:00:47 poc1 kernel: [ 4086.708613] [<ffffffff81311e1d>] list_del+0xd/0x30
>> May 13 12:00:47 poc1 kernel: [ 4086.708624] [<ffffffffa05de23c>] fc_io_compl+0x1cc/0x710 [libfc]
>> May 13 12:00:47 poc1 kernel: [ 4086.708633] [<ffffffffa05de7df>] fc_fcp_complete_locked+0x5f/0x1a0 [libfc]
>> May 13 12:00:47 poc1 kernel: [ 4086.708642] [<ffffffffa05dfac9>] fc_fcp_resp.isra.22+0x79/0x2f0 [libfc]
>> May 13 12:00:47 poc1 kernel: [ 4086.708651] [<ffffffff810a2a33>] ? load_balance+0xe3/0x740
>> May 13 12:00:47 poc1 kernel: [ 4086.708660] [<ffffffffa05e0424>] fc_fcp_recv+0x6e4/0xef0 [libfc]
>> May 13 12:00:47 poc1 kernel: [ 4086.708666] [<ffffffff810115ce>] ? __switch_to+0x13e/0x4b0
>> May 13 12:00:47 poc1 kernel: [ 4086.708673] [<ffffffff8164aab5>] ? _raw_spin_unlock_bh+0x15/0x20
>> May 13 12:00:47 poc1 kernel: [ 4086.708682] [<ffffffffa05dfd40>] ? fc_fcp_resp.isra.22+0x2f0/0x2f0 [libfc]
>> May 13 12:00:47 poc1 kernel: [ 4086.708690] [<ffffffffa05d421b>] fc_exch_recv+0x8eb/0xd70 [libfc]
>> May 13 12:00:47 poc1 kernel: [ 4086.708695] [<ffffffffa0613299>] fcoe_percpu_receive_thread+0x299/0x540 [fcoe]
>> May 13 12:00:47 poc1 kernel: [ 4086.708699] [<ffffffffa0613000>] ? fcoe_set_port_id+0x50/0x50 [fcoe]
>> May 13 12:00:47 poc1 kernel: [ 4086.708705] [<ffffffff81088660>] kthread+0xc0/0xd0
>> May 13 12:00:47 poc1 kernel: [ 4086.708710] [<ffffffff810885a0>] ? insert_kthread_work+0x40/0x40
>> May 13 12:00:47 poc1 kernel: [ 4086.708717] [<ffffffff8165332c>] ret_from_fork+0x7c/0xb0
>> May 13 12:00:47 poc1 kernel: [ 4086.708723] [<ffffffff810885a0>] ? insert_kthread_work+0x40/0x40
>> May 13 12:00:47 poc1 kernel: [ 4086.708728] ---[ end trace 61dc774d1f379191 ]---
>>
>
> No idea on the initiator side issue. Intel folks..?
> (Adding openfcoe-dev CC')
>
> --nab
>
>> [root@poc1 log]# lspci | grep 82599
>> 08:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
>> SFI/SFP+ Network Connection (rev 01)
>>
>> [root@poc1 log]# uname -a
>> Linux poc1 3.11.10-301.fc20.x86_64 #1 SMP Thu Dec 5 14:01:17 UTC 2013
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> Thanks,