Hello,

I've pulled the latest lio-4.1 branch (head 2a940ce682163c) and there are
new NULL dereferences introduced since 2.6.39 that I can quite regularly hit
with my iscsi cluster setup. They occur when removing busy iscsi targets
during I/O activity from initiators.

First oops:

[ 345.869604] BUG: unable to handle kernel NULL pointer dereference at 0000000000000168
[ 345.869866] IP: [<ffffffffa00871b0>] iscsit_add_cmd_to_response_queue+0xa0/0xe0 [iscsi_target_mod]
[ 345.870099] PGD 31ab98067 PUD 31aba6067 PMD 0
[ 345.870335] Oops: 0000 [#1] SMP
[ 345.870527] CPU 0
[ 345.870573] Modules linked in: target_core_iblock target_core_file target_core_pscsi target_core_stgt scsi_tgt iscsi_target_mod target_core_mod bonding
[ 345.871239]
[ 345.871342] Pid: 5983, comm: LIO_iblock Not tainted 3.0.0-rc6+ #58 Dell Inc. PowerEdge R510/00HDP0
[ 345.871644] RIP: 0010:[<ffffffffa00871b0>] [<ffffffffa00871b0>] iscsit_add_cmd_to_response_queue+0xa0/0xe0 [iscsi_target_mod]
[ 345.871867] RSP: 0018:ffff88031979dda0 EFLAGS: 00010246
[ 345.871976] RAX: 0000000000000000 RBX: ffff88031d504c00 RCX: 0000000000000000
[ 345.872090] RDX: 0000000000000028 RSI: 0000000000000000 RDI: ffffffffa00871a9
[ 345.872206] RBP: ffff88031979ddd0 R08: 0000000000000000 R09: ffff880319714ef0
[ 345.872319] R10: dead000000200200 R11: 000000000000004d R12: ffff880321d0a080
[ 345.872320] R13: ffff88031d504f28 R14: ffff880319714ef0 R15: ffff880319714f00
[ 345.872322] FS: 0000000000000000(0000) GS:ffff88032f200000(0000) knlGS:0000000000000000
[ 345.872324] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 345.872325] CR2: 0000000000000168 CR3: 0000000319b2e000 CR4: 00000000000006f0
[ 345.872327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 345.872329] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 345.872331] Process LIO_iblock (pid: 5983, threadinfo ffff88031979c000, task ffff8803196b4670)
[ 345.872332] Stack:
[ 345.872333] ffff88031979dde0 ffff880321d0a340 ffff880320272000 ffff880320272230
[ 345.872335] ffff880320272180 0000000000000000 ffff88031979dde0 ffffffffa00926e6
[ 345.872338] ffff88031979ded0 ffffffffa003cdfa ffff88031979de00 ffff88031979de70
[ 345.872340] Call Trace:
[ 345.872350] [<ffffffffa00926e6>] lio_queue_data_in+0x26/0x30 [iscsi_target_mod]
[ 345.872370] [<ffffffffa003cdfa>] transport_processing_thread+0x70a/0xdc0 [target_core_mod]
[ 345.872376] [<ffffffff81033776>] ? finish_task_switch+0x66/0xd0
[ 345.872381] [<ffffffff8153c5b1>] ? schedule+0x271/0x6e0
[ 345.872386] [<ffffffff8105bef0>] ? wake_up_bit+0x40/0x40
[ 345.872393] [<ffffffffa003c6f0>] ? transport_handle_cdb_direct+0x70/0x70 [target_core_mod]
[ 345.872395] [<ffffffff8105ba16>] kthread+0x96/0xa0
[ 345.872402] [<ffffffff815403d4>] kernel_thread_helper+0x4/0x10
[ 345.872404] [<ffffffff8105b980>] ? __init_kthread_worker+0x40/0x40
[ 345.872406] [<ffffffff815403d0>] ? gs_change+0xb/0xb
[ 345.872407] Code: 00 4c 89 bb b0 03 00 00 49 89 56 10 49 89 46 18 4c 89 38 f0 41 ff 84 24 dc 00 00 00 4c 89 ef e8 c7 78 4b e1 48 8b 83 e8 03 00 00
[ 345.872417] 8b b8 68 01 00 00 e8 f4 21 fb e0 48 8b 5d d8 4c 8b 65 e0 4c
[ 345.872421] RIP [<ffffffffa00871b0>] iscsit_add_cmd_to_response_queue+0xa0/0xe0 [iscsi_target_mod]
[ 345.872426] RSP <ffff88031979dda0>
[ 345.872427] CR2: 0000000000000168
[ 345.872454] ---[ end trace e7eac49507444d66 ]---

Gdb output for iscsit_add_cmd_to_response_queue+0xa0:

(gdb) list *(iscsit_add_cmd_to_response_queue+0xa0)
0x131e0 is in iscsit_add_cmd_to_response_queue (drivers/target/iscsi/iscsi_target_util.c:721).
716         spin_lock_bh(&conn->response_queue_lock);
717         list_add_tail(&qr->qr_list, &conn->response_queue_list);
718         atomic_inc(&cmd->response_queue_count);
719         spin_unlock_bh(&conn->response_queue_lock);
720
721         wake_up_process(conn->thread_set->tx_thread);
722 }

Second oops:

[ 346.012397] Target_Core_ConfigFS: Calling se_free_virtual_device() for se_dev_ptr: ffff880320272000
[ 346.012405] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 346.012407] IP: [<ffffffff810629fa>] exit_creds+0x1a/0x90
[ 346.012417] PGD 30abe4067 PUD 309fc0067 PMD 0
[ 346.012419] Oops: 0000 [#2] SMP
[ 346.012422] CPU 1
[ 346.012424] Modules linked in: target_core_iblock target_core_file target_core_pscsi target_core_stgt scsi_tgt iscsi_target_mod target_core_mod bonding
[ 346.012430]
[ 346.012432] Pid: 6434, comm: liofx Tainted: G D 3.0.0-rc6+ #58 Dell Inc. PowerEdge R510/00HDP0
[ 346.012435] RIP: 0010:[<ffffffff810629fa>] [<ffffffff810629fa>] exit_creds+0x1a/0x90
[ 346.012438] RSP: 0018:ffff8803197dbcb8 EFLAGS: 00010296
[ 346.012439] RAX: 0000000000000000 RBX: ffff8803196b4670 RCX: 00000000000001f2
[ 346.012441] RDX: 0000000000000006 RSI: ffff880309d8ac80 RDI: 0000000000000000
[ 346.012442] RBP: ffff8803197dbcc8 R08: 0000000000000020 R09: 0000000000000005
[ 346.012443] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 346.012445] R13: ffffffffa00d2440 R14: 0000000000000000 R15: 0000000000000000
[ 346.012446] FS: 00007f0cdd7eb720(0000) GS:ffff88032f220000(0000) knlGS:0000000000000000
[ 346.012448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 346.012449] CR2: 0000000000000000 CR3: 00000003197e7000 CR4: 00000000000006e0
[ 346.012451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 346.012453] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 346.012454] Process liofx (pid: 6434, threadinfo ffff8803197da000, task ffff880321cfb990)
[ 346.012456] Stack:
[ 346.012456] ffff8803200c3640 ffff8803196b4670 ffff8803197dbce8 ffffffff8103b8f5
[ 346.012459] ffff8803197dbce8 ffff8803196b4670 ffff8803197dbd08 ffffffff8105bd58
[ 346.012462] ffff880320272000 ffff8803200c3600 ffff8803197dbd38 ffffffffa002714b
[ 346.012464] Call Trace:
[ 346.012472] [<ffffffff8103b8f5>] __put_task_struct+0x35/0xa0
[ 346.012477] [<ffffffff8105bd58>] kthread_stop+0x78/0xe0
[ 346.012496] [<ffffffffa002714b>] se_release_device_for_hba+0x3b/0xe0 [target_core_mod]
[ 346.012502] [<ffffffffa002721c>] se_free_virtual_device+0x2c/0x40 [target_core_mod]
[ 346.012507] [<ffffffffa0024fdd>] target_core_dev_release+0x6d/0xc0 [target_core_mod]
[ 346.012512] [<ffffffff81157810>] ? config_item_put+0x20/0x20
[ 346.012514] [<ffffffff81157875>] config_item_release+0x65/0xa0
[ 346.012517] [<ffffffff81157810>] ? config_item_put+0x20/0x20
[ 346.012521] [<ffffffff81282247>] kref_put+0x37/0x70
[ 346.012523] [<ffffffff81157809>] config_item_put+0x19/0x20
[ 346.012525] [<ffffffff8115632d>] configfs_rmdir+0x18d/0x240
[ 346.012529] [<ffffffff810fa208>] vfs_rmdir+0x88/0xc0
[ 346.012531] [<ffffffff810fe1db>] do_rmdir+0x10b/0x120
[ 346.012534] [<ffffffff810ef9ed>] ? vfs_write+0x12d/0x180
[ 346.012536] [<ffffffff810efb2c>] ? sys_write+0x4c/0x90
[ 346.012538] [<ffffffff810fe241>] sys_rmdir+0x11/0x20
[ 346.012546] [<ffffffff8153f37b>] system_call_fastpath+0x16/0x1b

Gdb output for se_release_device_for_hba+0x3b:

(gdb) list *(se_release_device_for_hba+0x3b)
0x714b is in se_release_device_for_hba (drivers/target/target_core_device.c:734).
729                     (dev->dev_status & TRANSPORT_DEVICE_OFFLINE_DEACTIVATED))
730                 se_dev_stop(dev);
731
732         if (dev->dev_ptr) {
733                 kthread_stop(dev->process_thread);
734                 if (dev->transport->free_device)
735                         dev->transport->free_device(dev->dev_ptr);
736         }
737
738         spin_lock(&hba->device_lock);

Second oops usually follows close after the first one.
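
On the first oops: CR2 is 0x168 and RAX is 0 in the register dump, which looks consistent with conn->thread_set being NULL when line 721 loads ->tx_thread. Below is a minimal sketch of that address arithmetic. It assumes tx_thread sits at offset 0x168; the struct is a hypothetical stand-in with made-up padding, not the real definition from the iscsi target headers, and fault_address() is a helper invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the thread set structure. Only the offset
 * of tx_thread matters here; the padding is an assumption chosen so the
 * member lands at 0x168, matching CR2 in the first oops.
 */
struct fake_thread_set {
    char pad[0x168];   /* assumed: whatever precedes tx_thread */
    void *tx_thread;   /* member loaded at line 721 */
};

/* Compile-time check that the assumed layout really gives offset 0x168. */
static_assert(offsetof(struct fake_thread_set, tx_thread) == 0x168,
              "tx_thread is expected at offset 0x168 in this sketch");

/*
 * Address the CPU touches when reading base->tx_thread, computed
 * without dereferencing anything.
 */
uintptr_t fault_address(uintptr_t base)
{
    return base + offsetof(struct fake_thread_set, tx_thread);
}
```

With a NULL base, fault_address(0) gives 0x168, the same value as CR2 above. With a valid kernel pointer such as the RBX value from the dump, the result is nowhere near that low address.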

Also note the similarity of the first oops with the bug fixed in
70e69281e0616d18414f65a10d31e80efb91a51d ("iscsi-target: Move
conn->thread_set = NULL assignment after iscsi_release_thread_set").

Any ideas?

Martin