RE: tcm_fc+ libfcoe regression on v3.7-rc2


 



> 
> Hi MDR, Robert & Co,
> 
> During the process of updating target-pending.git/master to v3.7-rc2
> this afternoon, I noticed the following warnings below when using
> tcm_fc.
> 
> The "Poison overwritten" warning appears during each I/O, but the LUN SCAN + I/O
> seem to still be working as expected..
> 
> AFAICT there has not been anything affecting tcm_fc that has gone in
> recently, so it looks like some type of libfcoe or libfc regression.
> 
> Any ideas where to start looking to track this down..?
Nick,

I have been seeing something similar, though not identical, since the merge window before the
rc1 tag. So far I have not been able to pin down where it is, and I can no longer reproduce the
problem. It was exposed when the initiator somehow ended up zoned with the
SW target, even though the initiator itself was not intended to involve the SW target. So I would like to know
whether I can reproduce this with your setup in order to track it down. The bug was found during an lldp enable/disable
test w/ I/O running. From what I can tell, it is related to the exchange release path, where the reference
count on the exchange somehow gets messed up. Originally, I suspected that cancel_delayed_work()
was always returning true even with no work pending, which could have caused us to underflow
the refcnt on the exchange, but that was not the case. While investigating that, I found one minor issue:
fc_exch_find() may return a valid exchange even though the xid does not match. I have
a patch to fix that; however, the exchange pool must already have been messed up by the time that happens.
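
To make the two points above concrete, here is a minimal user-space sketch. It is not the libfc code and not the patch mentioned above: struct exch, exch_find(), exch_put(), exch_cancel_timer() and the pool layout are stand-ins invented for illustration. It shows a lookup that takes a reference only when the stored xid matches the requested one, and a release path that drops the timer's reference only when a pending timer was really cancelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 4	/* hypothetical pool size */

/* Stand-in for an exchange: only the fields this sketch needs. */
struct exch {
	uint16_t xid;		/* exchange ID stored in the object */
	int	 refcnt;	/* plain int here; the kernel uses atomic_t */
	bool	 timer_pending;	/* a queued timeout owns one reference */
};

struct exch_pool {
	uint16_t     min_xid;		/* first xid served by this pool */
	struct exch *slot[POOL_SLOTS];	/* NULL means the slot is free */
};

/*
 * Look up an exchange by xid. A reference is taken only if the slot is
 * populated AND the stored xid matches; a stale or recycled slot yields
 * NULL instead of the wrong exchange.
 */
static struct exch *exch_find(struct exch_pool *pool, uint16_t xid)
{
	struct exch *ep;

	if (xid < pool->min_xid || xid >= pool->min_xid + POOL_SLOTS)
		return NULL;

	ep = pool->slot[xid - pool->min_xid];
	if (ep == NULL || ep->xid != xid)
		return NULL;

	ep->refcnt++;
	return ep;
}

/* Drop a reference; trap an underflow instead of letting it go negative. */
static void exch_put(struct exch *ep)
{
	assert(ep->refcnt > 0);
	ep->refcnt--;
}

/*
 * Cancel the timeout. The timer's reference is dropped only if a timer
 * was actually pending; dropping it unconditionally is the kind of
 * underflow suspected (and then ruled out) above.
 */
static void exch_cancel_timer(struct exch *ep)
{
	if (ep->timer_pending) {
		ep->timer_pending = false;
		exch_put(ep);
	}
}
```

With this model, calling exch_cancel_timer() twice on the same exchange drops the reference only once, and a lookup for an xid whose slot holds a different exchange returns NULL without touching that exchange's refcnt.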

Anyway, I would like to mimic your setup to see if I can reproduce it.

The trace I had is pasted here FYI:
...
kernel: Pid: 5072, comm: kworker/u:7 Tainted: G        W    3.6.0-upstream-net-next-ixgbe-queue-x86_64-g0b
kernel: Call Trace:
kernel: [<ffffffff810541ff>] warn_slowpath_common+0x7f/0xc0
kernel: [<ffffffff810542f6>] warn_slowpath_fmt+0x46/0x50
kernel: [<ffffffff8126bb01>] __list_del_entry+0xa1/0xd0
kernel: [<ffffffff8126bb41>] list_del+0x11/0x40
kernel: [<ffffffffa03adfaf>] fc_exch_delete+0x6f/0xb0 [libfc]
kernel: [<ffffffffa03b1074>] fc_exch_timeout+0x124/0x150 [libfc]
kernel: [<ffffffff81070c27>] process_one_work+0x177/0x430
kernel: [<ffffffffa03b0f50>] ? fc_exch_rrq+0x220/0x220 [libfc]
kernel: [<ffffffff8107303e>] worker_thread+0x12e/0x380
kernel: [<ffffffff81072f10>] ? manage_workers+0x180/0x180
kernel: [<ffffffff810781ae>] kthread+0xce/0xe0
kernel: [<ffffffff815311c4>] kernel_thread_helper+0x4/0x10
kernel: [<ffffffff810780e0>] ? kthread_freezable_should_stop+0x70/0x70
kernel: [<ffffffff815311c0>] ? gs_change+0x13/0x13
kernel: ---[ end trace f4c13caf2990c079 ]---
kernel: ------------[ cut here ]------------
kernel: WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
kernel: Hardware name: PowerEdge T610
kernel: list_del corruption. prev->next should be ffff88031cfeb2e0, but was ffffe8ffffc80348
kernel: Pid: 5072, comm: kworker/u:7 Tainted: G        W    3.6.0-upstream-net-next-ixgbe-queue-x86_64-g0b
kernel: Call Trace:
kernel: [<ffffffff810541ff>] warn_slowpath_common+0x7f/0xc0
kernel: [<ffffffff810542f6>] warn_slowpath_fmt+0x46/0x50
kernel: [<ffffffff8126bb01>] __list_del_entry+0xa1/0xd0
kernel: [<ffffffff8126bb41>] list_del+0x11/0x40
kernel: [<ffffffffa03adfaf>] fc_exch_delete+0x6f/0xb0 [libfc]
kernel: [<ffffffffa03b1074>] fc_exch_timeout+0x124/0x150 [libfc]
kernel: [<ffffffff81070c27>] process_one_work+0x177/0x430
kernel: [<ffffffffa03b0f50>] ? fc_exch_rrq+0x220/0x220 [libfc]
kernel: [<ffffffff8107303e>] worker_thread+0x12e/0x380
kernel: [<ffffffff81072f10>] ? manage_workers+0x180/0x180
kernel: [<ffffffff810781ae>] kthread+0xce/0xe0
kernel: [<ffffffff815311c4>] kernel_thread_helper+0x4/0x10
kernel: [<ffffffff810780e0>] ? kthread_freezable_should_stop+0x70/0x70
kernel: [<ffffffff815311c0>] ? gs_change+0x13/0x13
kernel: ---[ end trace f4c13caf2990c07a ]---
kernel: ixgbe 0000:05:00.0: Multiqueue Enabled: Rx Queue count = 24, Tx Queue count = 24
kernel: ixgbe 0000:05:00.0 p3p1: detected SFP+: 5
kernel: ixgbe 0000:05:00.0 p3p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
kernel: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/u:2:390]
kernel: CPU 6
kernel: Pid: 390, comm: kworker/u:2 Tainted: G        W    3.6.0-upstream-net-next-ixgbe-queue-x86_64-g0bf
kernel: RIP: 0010:[<ffffffffa03afe6b>]  [<ffffffffa03afe6b>] fc_exch_reset+0x1b/0xf0 [libfc]
kernel: RSP: 0018:ffff880326293c90  EFLAGS: 00000286
kernel: RAX: ffff880326293fd8 RBX: ffff880326293c50 RCX: 0000000000b60300
kernel: RDX: ffff8803263e00a0 RSI: 0000000000000001 RDI: ffff8803263e0080
kernel: RBP: ffff880326293cb0 R08: 0000000000000004 R09: 0000000000000000
kernel: R10: 0000000000000014 R11: 0000000000000001 R12: ffff880326293c68
kernel: R13: ffff8803263e0100 R14: 0000000000000014 R15: ffff880326293c70
kernel: FS:  0000000000000000(0000) GS:ffff88032fc60000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
kernel: CR2: 00007f91b6920000 CR3: 0000000001a0b000 CR4: 00000000000007e0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
kernel: Process kworker/u:2 (pid: 390, threadinfo ffff880326292000, task ffff880326219540)
kernel: Stack:
kernel: ffff8801a53a06c0 ffffe8ffffc80340 0000000000000000 ffff8803263e0080
kernel: ffff880326293d00 ffffffffa03affd7 ffff880326293d00 ffffffff00b60300
kernel: 000000000000002c ffff8801a85b0840 ffffffff81ae6620 ffff8801a53a06c0
kernel: Call Trace:
kernel: [<ffffffffa03affd7>] fc_exch_pool_reset+0x97/0xe0 [libfc]
kernel: [<ffffffffa03b0092>] fc_exch_mgr_reset+0x72/0xb0 [libfc]
kernel: [<ffffffffa03b8ce0>] fc_rport_work+0x120/0x630 [libfc]
kernel: [<ffffffff8106f8a2>] ? ftrace_raw_event_workqueue_execute_start+0xb2/0xc0
kernel: [<ffffffff81070c27>] process_one_work+0x177/0x430
kernel: [<ffffffffa03b8bc0>] ? fc_rport_recv_els_req+0x1d0/0x1d0 [libfc]
kernel: [<ffffffff8107303e>] worker_thread+0x12e/0x380
kernel: [<ffffffff81072f10>] ? manage_workers+0x180/0x180
kernel: [<ffffffff810781ae>] kthread+0xce/0xe0
kernel: [<ffffffff815311c4>] kernel_thread_helper+0x4/0x10
kernel: [<ffffffff810780e0>] ? kthread_freezable_should_stop+0x70/0x70
kernel: [<ffffffff815311c0>] ? gs_change+0x13/0x13
kernel: Code: c0 e8 0a 4c 17 e1 e9 2a ff ff ff 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 1c 24 4c 89 64
4 24 18 <66> 66 66 66 90 48 89 fb e8 88 77 17 e1 31 f6 48 89 df e8 0e ea
kernel: libfcoe: host3: Missing Discovery Advertisement for fab 20ac000dec96e941 count 1



> 
> Thanks,
> 
> --nab
> 
> [ 3930.920161]
> ===================================================================
> ==========
> [ 3930.929274] BUG libfc_em (Tainted: G    B       ): Poison overwritten
> [ 3930.936447] -----------------------------------------------------------------------------
> [ 3930.936447]
> [ 3930.947203] INFO: 0xffff8806e04f0004-0xffff8806e04f0004. First byte 0x6a
> instead of 0x6b
> [ 3930.956219] INFO: Allocated in mempool_alloc_slab+0x10/0x12 age=9 cpu=3
> pid=25401
> [ 3930.964556] INFO: Freed in mempool_free_slab+0x12/0x14 age=11 cpu=3
> pid=18291
> [ 3930.972506] INFO: Slab 0xffffea001b813c00 objects=42 used=42 fp=0x
> (null) flags=0x8000000000004080
> [ 3930.983360] INFO: Object 0xffff8806e04f0000 @offset=0
> fp=0xffff8806e04f3d80
> [ 3930.983360]
> [ 3930.992762] Object ffff8806e04f0000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkjkkkkkkkkkkk
> [ 3931.003132] Object ffff8806e04f0010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.013500] Object ffff8806e04f0020: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.023868] Object ffff8806e04f0030: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.034235] Object ffff8806e04f0040: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.044605] Object ffff8806e04f0050: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.054973] Object ffff8806e04f0060: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.065341] Object ffff8806e04f0070: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.075709] Object ffff8806e04f0080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.086080] Object ffff8806e04f0090: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.096448] Object ffff8806e04f00a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.106816] Object ffff8806e04f00b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.117184] Object ffff8806e04f00c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.127552] Object ffff8806e04f00d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.137921] Object ffff8806e04f00e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  kkkkkkkkkkkkkkkk
> [ 3931.148290] Object ffff8806e04f00f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b a5  kkkkkkkkkkkkkkk.
> [ 3931.158659] Redzone ffff8806e04f0100: bb bb bb bb bb bb bb bb
> ........
> [ 3931.168351] Padding ffff8806e04f0140: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> [ 3931.178816] Padding ffff8806e04f0150: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> [ 3931.189282] Padding ffff8806e04f0160: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> [ 3931.199747] Padding ffff8806e04f0170: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> 5a 5a 5a  ZZZZZZZZZZZZZZZZ
> [ 3931.210213] Pid: 25402, comm: fcoethread/4 Tainted: G    B        3.7.0-rc2+ #51
> [ 3931.218452] Call Trace:
> [ 3931.221175]  [<ffffffff810b08cc>] print_trailer+0x126/0x12f
> [ 3931.227381]  [<ffffffff810b0dfe>] check_bytes_and_report+0xb2/0xeb
> [ 3931.234267]  [<ffffffff81088dbb>] ? mempool_alloc_slab+0xf/0x12
> [ 3931.240860]  [<ffffffff810b0eec>] check_object+0xb5/0x1df
> [ 3931.246873]  [<ffffffff81088dbc>] ? mempool_alloc_slab+0x10/0x12
> [ 3931.253565]  [<ffffffff810b1a69>] alloc_debug_processing+0xa4/0x138
> [ 3931.260547]  [<ffffffff810b3150>] T.1679+0x28d/0x2d9
> [ 3931.266075]  [<ffffffff81088dbc>] ? mempool_alloc_slab+0x10/0x12
> [ 3931.272767]  [<ffffffff81088dbc>] ? mempool_alloc_slab+0x10/0x12
> [ 3931.279458]  [<ffffffff810b330c>] kmem_cache_alloc+0x4d/0xa0
> [ 3931.285763]  [<ffffffff81088dbc>] mempool_alloc_slab+0x10/0x12
> [ 3931.292258]  [<ffffffff81088ee6>] mempool_alloc+0x5a/0x139
> [ 3931.298369]  [<ffffffff810b0f72>] ? check_object+0x13b/0x1df
> [ 3931.304674]  [<ffffffffa03d673a>] fc_exch_em_alloc+0x25/0x1f6 [libfc]
> [ 3931.311850]  [<ffffffffa03d6c62>] fc_seq_lookup_recip+0x17c/0x347 [libfc]
> [ 3931.319413]  [<ffffffffa03d6eac>] fc_seq_assign+0x7f/0x9c [libfc]
> [ 3931.326202]  [<ffffffffa067829e>] ft_recv_req+0x76/0x14c [tcm_fc]
> [ 3931.332991]  [<ffffffffa0679f1c>] ft_recv+0x129/0x132 [tcm_fc]
> [ 3931.339490]  [<ffffffffa03db815>] fc_lport_recv_req+0x68/0xc5 [libfc]
> [ 3931.346667]  [<ffffffffa03d8ecd>] fc_exch_recv+0xaed/0xbb0 [libfc]
> [ 3931.353552]  [<ffffffffa06b9cc4>] ? fcoe_percpu_thread_create+0x7b/0x7b
> [fcoe]
> [ 3931.361600]  [<ffffffffa06ba05c>] fcoe_percpu_receive_thread+0x398/0x46f
> [fcoe]
> [ 3931.369744]  [<ffffffffa06b9cc4>] ? fcoe_percpu_thread_create+0x7b/0x7b
> [fcoe]
> [ 3931.377790]  [<ffffffff81043f56>] kthread+0xb0/0xb8
> [ 3931.383224]  [<ffffffff81043ea6>] ? kthread_freezable_should_stop+0x60/0x60
> [ 3931.390979]  [<ffffffff81362c7c>] ret_from_fork+0x7c/0xb0
> [ 3931.396993]  [<ffffffff81043ea6>] ? kthread_freezable_should_stop+0x60/0x60
> [ 3931.404749] FIX libfc_em: Restoring 0xffff8806e04f0004-
> 0xffff8806e04f0004=0x6b
> [ 3931.404749]
> [ 3931.414441] FIX libfc_em: Marking all objects used
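
A note on reading the SLUB report above: freed objects are filled with 0x6b and end in 0xa5 (POISON_FREE and POISON_END in include/linux/poison.h), so the single 0x6a at offset 4 is exactly one less than the poison value. That would be consistent with something decrementing a counter inside an already-freed exchange, but this is a guess, not a confirmed diagnosis. The check itself can be sketched in user space as follows; first_poison_mismatch() is a made-up name for illustration and this is not the SLUB implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POISON_FREE 0x6b	/* fill byte for freed objects */
#define POISON_END  0xa5	/* last byte of a poisoned object */

/*
 * Return the offset of the first byte that no longer holds the expected
 * poison pattern, or -1 if the object is intact.
 */
static long first_poison_mismatch(const uint8_t *obj, size_t len)
{
	size_t i;

	assert(len > 0);
	for (i = 0; i < len; i++) {
		uint8_t expect = (i == len - 1) ? POISON_END : POISON_FREE;

		if (obj[i] != expect)
			return (long)i;
	}
	return -1;
}
```

Run against a 256-byte buffer laid out like the object dump above (0x6b throughout, 0xa5 in the last byte, 0x6a at offset 4), the function reports offset 4, matching the 0xffff8806e04f0004 address in the INFO line.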
> 
> 
> 
> 