Hi Himanshu,

(Adding target-devel CC' again)

On Mon, 2016-03-14 at 21:25 +0000, Himanshu Madhani wrote:
> Hi Nic,
>
> Running the latest upstream kernel 4.5.0-rc7 + your patch
> 5643d9c6664beaa171c88dd0a4e99a7420ac50cb ("target: Drop incorrect
> ABORT_TASK put for completed commands"), I ran into the following
> stack trace with my script that triggers host/bus/device reset in a
> loop, after 14 hours of runtime.
>
> [52431.733950] qla2xxx [0000:06:00.0]-385e:13: Building additional status packet 0xffff88041d452280.
> [52431.734055] qla2xxx [0000:06:00.0]-385e:13: Building additional status packet 0xffff88041d452340.
> [52557.733014] INFO: task kworker/10:23:11750 blocked for more than 120 seconds.
> [52557.733021] Tainted: G OE 4.5.0-rc7+ #47
> [52557.733022] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [52557.733024] kworker/10:23 D ffff8800530ef7d8 0 11750 2 0x00000080
> [52557.733050] Workqueue: events qlt_free_session_done [qla2xxx]
> [52557.733053] ffff8800530ef7d8 0000000000000001 ffff8804276d6870 ffff88042c4b0540
> [52557.733057] ffff8800b57546c0 0000000000000000 ffff88025f7af740 ffff8800530ef778
> [52557.733060] ffffffff812ea754 ffff88043f9bf850 0000000000000000 ffff88043f9bf850
> [52557.733064] Call Trace:
> [52557.733072] [<ffffffff812ea754>] ? queue_unplugged+0x84/0x190
> [52557.733079] [<ffffffff810a6560>] ? enqueue_sleeper+0xf0/0x580
> [52557.733084] [<ffffffff810bf3ad>] ? trace_hardirqs_on+0xd/0x10
> [52557.733091] [<ffffffff8168d887>] schedule+0x47/0xc0
> [52557.733094] [<ffffffff81691df0>] schedule_timeout+0x1f0/0x300
> [52557.733096] [<ffffffff810bc73e>] ? __lock_acquired+0x3be/0x400
> [52557.733099] [<ffffffff8168e992>] ? wait_for_completion+0xe2/0x120
> [52557.733101] [<ffffffff810bc73e>] ? __lock_acquired+0x3be/0x400
> [52557.733104] [<ffffffff8168e99a>] wait_for_completion+0xea/0x120
> [52557.733109] [<ffffffff8109f510>] ? try_to_wake_up+0x410/0x410
> [52557.733133] [<ffffffffa05f2e1d>] target_wait_for_sess_cmds+0x4d/0x1c0 [target_core_mod]
> [52557.733141] [<ffffffffa0676f80>] ? qla2xxx_wake_dpc+0x30/0x40 [qla2xxx]
> [52557.733148] [<ffffffffa0676fe8>] ? qla2x00_post_work+0x58/0x70 [qla2xxx]
> [52557.733152] [<ffffffffa072b109>] tcm_qla2xxx_free_session+0x49/0x90 [tcm_qla2xxx]
> [52557.733161] [<ffffffffa06d4009>] qlt_free_session_done+0xf9/0x3d0 [qla2xxx]
> [52557.733164] [<ffffffff810bc73e>] ? __lock_acquired+0x3be/0x400
> [52557.733169] [<ffffffff810865e1>] process_one_work+0x231/0x760
> [52557.733172] [<ffffffff8108653a>] ? process_one_work+0x18a/0x760
> [52557.733174] [<ffffffff810bc73e>] ? __lock_acquired+0x3be/0x400
> [52557.733177] [<ffffffff81086d13>] ? worker_thread+0x203/0x530
> [52557.733180] [<ffffffff81086c7d>] worker_thread+0x16d/0x530
> [52557.733183] [<ffffffff8109f522>] ? default_wake_function+0x12/0x20
> [52557.733185] [<ffffffff810b1fc6>] ? __wake_up_common+0x56/0x90
> [52557.733187] [<ffffffff81086b10>] ? process_one_work+0x760/0x760
> [52557.733190] [<ffffffff8168d887>] ? schedule+0x47/0xc0
> [52557.733192] [<ffffffff81086b10>] ? process_one_work+0x760/0x760
> [52557.733195] [<ffffffff8108ccff>] kthread+0xef/0x110
> [52557.733198] [<ffffffff8109508d>] ? finish_task_switch+0x8d/0x230
> [52557.733201] [<ffffffff810970be>] ? schedule_tail+0x1e/0xd0
> [52557.733203] [<ffffffff8108cc10>] ? __init_kthread_worker+0x70/0x70
> [52557.733205] [<ffffffff81693dbf>] ret_from_fork+0x3f/0x70
> [52557.733208] [<ffffffff8108cc10>] ? __init_kthread_worker+0x70/0x70
> [52557.733209] INFO: lockdep is turned off.
> [52557.733216] Sending NMI to all CPUs:
> [52557.736028] NMI backtrace for cpu 0
>
> Analyzing the source in target_wait_for_sess_cmds(), we made the
> following change, and the test has been running for 48+ hours.

Thank you for tracking this down.
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 867bc6d..43d8b42 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -2596,8 +2596,6 @@ void target_wait_for_sess_cmds(struct se_session *se_sess)
>
>  	list_for_each_entry_safe(se_cmd, tmp_cmd,
>  			&se_sess->sess_wait_list, se_cmd_list) {
> -		list_del_init(&se_cmd->se_cmd_list);
> -
>  		pr_debug("Waiting for se_cmd: %p t_state: %d, fabric state:"
>  			" %d\n", se_cmd, se_cmd->t_state,
>  			se_cmd->se_tfo->get_cmd_state(se_cmd));
>
> Let me know if this fix looks okay to you.

Applying the following patch with your authorship + stable CC' to
target-pending/for-next.

Thank you,

--nab

>From 484dfe2e26f7c6c1ab463926de4cef5f036043a9 Mon Sep 17 00:00:00 2001
From: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
Date: Mon, 14 Mar 2016 22:47:37 -0700
Subject: [PATCH] target: Fix target_release_cmd_kref shutdown comp leak

This patch fixes an active I/O shutdown bug for fabric drivers using
target_wait_for_sess_cmds(), where se_cmd descriptor shutdown would
result in hung tasks waiting indefinitely for se_cmd->cmd_wait_comp
to complete().

To address this bug, drop the incorrect list_del_init() usage in
target_wait_for_sess_cmds() and always complete() during the final
se_cmd target_release_cmd_kref() put, so that the caller can invoke
the final fabric release callback into se_cmd->se_tfo->release_cmd()
code.
Reported-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
Tested-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
Signed-off-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
---
 drivers/target/target_core_transport.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index df01997..734c79e 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2669,8 +2669,6 @@ void target_wait_for_sess_cmds(struct se_session *se_sess)

 	list_for_each_entry_safe(se_cmd, tmp_cmd,
 			&se_sess->sess_wait_list, se_cmd_list) {
-		list_del_init(&se_cmd->se_cmd_list);
-
 		pr_debug("Waiting for se_cmd: %p t_state: %d, fabric state:"
 			" %d\n", se_cmd, se_cmd->t_state,
 			se_cmd->se_tfo->get_cmd_state(se_cmd));
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html