Kernel oops during iscsi target shutdown

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

my iscsi stress tests triggered a bug during target shutdown in latest stable
kernels (3.7.1 and earlier). Turning off an iscsi target with active connections
during heavy random write i/o sometimes leads to kernel oopses on different
places, hung kernel tasks and other symptoms of memory corruption. This
calltrace best shows what happens:

BUG: unable to handle kernel NULL pointer dereference at (null)
Entering kdb (current=0xffff88022693d9d0, pid 15) on processor 1 Oops: (null)
due to oops @ 0xffffffff812d6bb6
Pid: 15, comm: kworker/1:0 Not tainted 3.7.1-00004-ga57fbc7-dirty #23 Dell Inc. PowerEdge R510/0DPRKF
RIP: 0010:[<ffffffff812d6bb6>]  [<ffffffff812d6bb6>] __list_add+0x26/0xd0
Call Trace:
[<ffffffffa00952ba>] iscsit_add_cmd_to_response_queue+0x9a/0x100 [iscsi_target_mod]
[<ffffffffa00a20f6>] lio_queue_status+0x26/0x30 [iscsi_target_mod]
[<ffffffffa003d818>] target_complete_ok_work+0x198/0x200 [target_core_mod]
[<ffffffff81056913>] process_one_work+0x133/0x410
[<ffffffffa003d680>] ? target_qf_do_work+0x2a0/0x2a0 [target_core_mod]
[<ffffffff81057800>] worker_thread+0x160/0x3c0
[<ffffffff810576a0>] ? manage_workers+0x280/0x280
[<ffffffff8105cfc6>] kthread+0xc6/0xd0
[<ffffffff8105cf00>] ? flush_kthread_worker+0xb0/0xb0
[<ffffffff8164c02c>] ret_from_fork+0x7c/0xb0
[<ffffffff8105cf00>] ? flush_kthread_worker+0xb0/0xb0

I put few printks to the code and found that iscsit_add_cmd_to_response_queue
was called _after_ iscsit_close_connection. So the iscsi termination logic seems
to be broken in the way that iscsi connection is closed and kfree'd while there are
still pending iscsi_cmds somewhere in the target referencing the deallocated
connection. Below is a dmesg snippet where iscsi_conn with address
ffff880224924800 is kfree'd (time 432.338664) but after successful termination
of the target, two outstanding cmds use this connection address in
iscsit_add_cmd_to_response_queue (time 432.479821).

Dec 22 21:35:16 goose2 kernel: [  432.338662] Moving to TARG_CONN_STATE_FREE.
Dec 22 21:35:16 goose2 kernel: [  432.338664] MARTIN: freeing connection ffff880224924800.
Dec 22 21:35:16 goose2 kernel: [  432.338665] Decremented iSCSI connection count to 0 from node: iqn.1998-01.com.vmware:hv-esx8
Dec 22 21:35:16 goose2 kernel: [  432.338666] Moving to TARG_SESS_STATE_FAILED.
Dec 22 21:35:16 goose2 kernel: [  432.338671] TARGET_CORE[iSCSI]: Deregistered fabric_sess
Dec 22 21:35:16 goose2 kernel: [  432.338673] Moving to TARG_SESS_STATE_FREE.
Dec 22 21:35:16 goose2 kernel: [  432.338674] Released iSCSI session from node: iqn.1998-01.com.vmware:hv-esx8
Dec 22 21:35:16 goose2 kernel: [  432.338675] Decremented number of active iSCSI Sessions on iSCSI TPG: 1 to 0
Dec 22 21:35:16 goose2 kernel: [  432.338677] Released 1 iSCSI Session(s) from Target Portal Group: 1
Dec 22 21:35:16 goose2 kernel: [  432.338678] iSCSI_TPG[1] - Disabled iSCSI Target Portal Group
Dec 22 21:35:16 goose2 kernel: [  432.338717] LIO_Target_ConfigFS: DEREGISTER -> iqn.2011-07.com.zoner:inetstore.vmw0.san2 TPGT: 1 PORTAL: 192.168.224.10:3260
Dec 22 21:35:16 goose2 kernel: [  432.338720] CORE[iqn.2011-07.com.zoner:inetstore.vmw0.san2] - Removed Network Portal: 192.168.224.10:3260,1 on TCP
Dec 22 21:35:16 goose2 kernel: [  432.338788] CORE[0] - Removed Network Portal: 192.168.224.10:3260 on TCP
Dec 22 21:35:16 goose2 kernel: [  432.338789] LIO_Target_ConfigFS: delnpfromtpg done!
Dec 22 21:35:16 goose2 kernel: [  432.338838] iSCSI_TPG[1]_LUN[2] - Removed ACL for InitiatorNode: iqn.1998-01.com.vmware:hv-esx9 Mapped LUN: 2
Dec 22 21:35:16 goose2 kernel: [  432.338857] iSCSI_TPG[1] - Freeing ACL for iSCSI InitiatorNode: iqn.1998-01.com.vmware:hv-esx9 Mapped LUN: 2
Dec 22 21:35:16 goose2 kernel: [  432.338877] iSCSI_TPG[1]_LUN[1] - Removed ACL for InitiatorNode: iqn.1998-01.com.vmware:hv-esx9 Mapped LUN: 1
Dec 22 21:35:16 goose2 kernel: [  432.338894] iSCSI_TPG[1] - Freeing ACL for iSCSI InitiatorNode: iqn.1998-01.com.vmware:hv-esx9 Mapped LUN: 1
Dec 22 21:35:16 goose2 kernel: [  432.338933] iSCSI_TPG[1] - Deleted ACL with TCQ Depth: 64 for iSCSI Initiator Node: iqn.1998-01.com.vmware:hv-esx9
Dec 22 21:35:16 goose2 kernel: [  432.338965] iSCSI_TPG[1]_LUN[2] - Removed ACL for InitiatorNode: iqn.1998-01.com.vmware:hv-esx8 Mapped LUN: 2
Dec 22 21:35:16 goose2 kernel: [  432.338983] iSCSI_TPG[1] - Freeing ACL for iSCSI InitiatorNode: iqn.1998-01.com.vmware:hv-esx8 Mapped LUN: 2
Dec 22 21:35:16 goose2 kernel: [  432.339003] iSCSI_TPG[1]_LUN[1] - Removed ACL for InitiatorNode: iqn.1998-01.com.vmware:hv-esx8 Mapped LUN: 1
Dec 22 21:35:16 goose2 kernel: [  432.339024] iSCSI_TPG[1] - Freeing ACL for iSCSI InitiatorNode: iqn.1998-01.com.vmware:hv-esx8 Mapped LUN: 1
Dec 22 21:35:16 goose2 kernel: [  432.339063] iSCSI_TPG[1] - Deleted ACL with TCQ Depth: 64 for iSCSI Initiator Node: iqn.1998-01.com.vmware:hv-esx8
Dec 22 21:35:16 goose2 kernel: [  432.339184] iSCSI_TPG[1]_LUN[2] - Deactivated iSCSI Logical Unit from device object
Dec 22 21:35:16 goose2 kernel: [  432.339313] iSCSI_TPG[1]_LUN[1] - Deactivated iSCSI Logical Unit from device object
Dec 22 21:35:16 goose2 kernel: [  432.339353] LIO_Target_ConfigFS: DEREGISTER -> Releasing TPG
Dec 22 21:35:16 goose2 kernel: [  432.339355] Released 0 iSCSI Session(s) from Target Portal Group: 1
Dec 22 21:35:16 goose2 kernel: [  432.339366] TARGET_CORE[Normal]: Deallocating iSCSI struct se_portal_group for endpoint: iqn.2011-07.com.zoner:inetstore.vmw0.san2 Portal Tag 1
Dec 22 21:35:16 goose2 kernel: [  432.339398] CORE[iqn.2011-07.com.zoner:inetstore.vmw0.san2]_TPG[1] - Deleted iSCSI Target Portal Group
...
Dec 22 21:35:16 goose2 lrmd: [3117]: info: operation stop[153] on tpgvmw0 for client 3120: pid 8900 exited with return code 0
...
Dec 22 21:35:17 goose2 kernel: [  432.479815] Incremented dev->dev_cur_ordered_id: 42204 for SIMPLE: 42204
Dec 22 21:35:17 goose2 kernel: [  432.479821] MARTIN: Adding ffff880210641a40 cmd to conn ffff880224924800 as qreq ffff88020fca4330
Dec 22 21:35:17 goose2 kernel: [  432.483560] Incremented dev->dev_cur_ordered_id: 42205 for SIMPLE: 42241
Dec 22 21:35:17 goose2 kernel: [  432.483564] MARTIN: Adding ffff88020e39eac0 cmd to conn ffff880224924800 as qreq ffff88020fca4230
...
Dec 22 21:35:17 goose2 kernel: [  432.510296] LIO_Target_ConfigFS: DEREGISTER -> iqn.2011-07.com.zoner:inetstore.vmw0.san2
Dec 22 21:35:17 goose2 kernel: [  432.510302] CORE[0] - Deleted iSCSI Target IQN: iqn.2011-07.com.zoner:inetstore.vmw0.san2
Dec 22 21:35:17 goose2 kernel: [  432.510338] LIO_Target_ConfigFS: DEREGISTER -> iqn.2011-07.com.zoner:inetstore.vmw0.san1
Dec 22 21:35:17 goose2 kernel: [  432.510340] CORE[0] - Deleted iSCSI Target IQN: iqn.2011-07.com.zoner:inetstore.vmw0.san1
Dec 22 21:35:17 goose2 kernel: [  432.510485] Target_Core_ConfigFS: Calling se_free_virtual_device() for se_dev_ptr: ffff880211081c00
Dec 22 21:35:17 goose2 kernel: [  432.510527] Target_Core_ConfigFS: Deallocating se_subsystem_dev_t: ffff8802102d1000
Dec 22 21:35:17 goose2 kernel: [  432.510604] Target_Core_ConfigFS: Calling se_free_virtual_device() for se_dev_ptr: ffff88020fc71000
Dec 22 21:35:17 goose2 kernel: [  432.510666] Target_Core_ConfigFS: Deallocating se_subsystem_dev_t: ffff8802103c3000
Dec 22 21:35:17 goose2 kernel: [  432.510690] CORE_HBA[12] - Detached HBA from Generic Target Core
Dec 22 21:35:17 goose2 kernel: [  432.510708] CORE_HBA[11] - Detached HBA from Generic Target Core
Dec 22 21:35:17 goose2 kernel: [  432.510729] CORE_HBA[10] - Detached HBA from Generic Target Core
...
Dec 22 21:35:44 goose2 kernel: [  460.265788] BUG: soft lockup - CPU#4 stuck for 22s! [prmount:8994]

Tested with 3.7.1 stable kernel, 3.6.x looks to have the same problem.
I skipped v3.4 and v3.5 so I don't know if the bug was introduced in v3.6
or earlier.

Martin

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux