On Fri, 2017-06-02 at 16:30 +0000, Bart Van Assche wrote:
> On Thu, 2017-06-01 at 20:19 -0700, Nicholas A. Bellinger wrote:
> > Here's the updated version to restore original behavior for se_node_acl
> > delete, but still avoid the endless loop with the iscsi-target specific
> > case where se_node_acl->queue_depth changes.
> >
> > Care to verify on ib_srpt, or just a report and never confirm..?
>
> Hello Nic,
>
> This is what I ran into with commit 4f61e1e687c4 ("target: Avoid
> target_shutdown_sessions loop during queue_depth change") merged with kernel
> v4.12-rc3. This is a crash I had never seen before. This crash disappears if
> I revert commit 4f61e1e687c4 so I think this indicates a bug introduced by
> that commit:
>

Well, commit 4f61e1e687c4 does not change the original behavior to drain
the list of active se_node_acl sessions:

diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index 3691373..1b2b60e 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -336,14 +336,14 @@ struct se_node_acl *core_tpg_add_initiator_node_acl(
 	return acl;
 }
-static void target_shutdown_sessions(struct se_node_acl *acl)
+static void target_shutdown_sessions(struct se_node_acl *acl, bool do_restart)
 {
-	struct se_session *sess;
+	struct se_session *sess, *sess_tmp;
 	unsigned long flags;
 restart:
 	spin_lock_irqsave(&acl->nacl_sess_lock, flags);
-	list_for_each_entry(sess, &acl->acl_sess_list, sess_acl_list) {
+	list_for_each_entry_safe(sess, sess_tmp, &acl->acl_sess_list, sess_acl_list) {
 		if (sess->sess_tearing_down)
 			continue;
@@ -352,7 +352,11 @@ static void target_shutdown_sessions(struct se_node_acl *acl)
 		if (acl->se_tpg->se_tpg_tfo->close_session)
 			acl->se_tpg->se_tpg_tfo->close_session(sess);
-		goto restart;
+
+		if (do_restart)
+			goto restart;
+
+		spin_lock_irqsave(&acl->nacl_sess_lock, flags);
 	}
 	spin_unlock_irqrestore(&acl->nacl_sess_lock, flags);
 }

That is, it's doing the same thing as before in
target_shutdown_sessions() walking se_node_acl->acl_sess_list, invoking
->close_session(), and immediately restarting the list walk after each
one.

How can this mean srpt..?

> ib_srpt:srpt_close_ch: ib_srpt 0x0000000000000000e41d2d03000a6d51-1114: queued zerolength write
> ib_srpt:srpt_release_channel_work: ib_srpt srpt_release_channel_work: 0x0000000000000000e41d2d03000a6d51-1114; release_done = (null)
> ------------[ cut here ]------------
> kernel BUG at drivers/infiniband/ulp/srpt/ib_srpt.c:2770!

Btw, looking at v4.12-rc3 there is not a BUG_ON() at line 2770.

Perhaps BUG_ON(ch->release_done) at line 2719, which could indicate
srpt_close_session() is being called twice...

But if it is, then why isn't the srpt_close_session() pr_debug shown
anywhere in your output..?

Can I have a look at the full debug with the missing
srpt_close_session() messages to see if it's being called twice for the
same se_session, and the code changes against v4.12-rc3 you're testing
with that account for the ~50 lines offset..?
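
For reference, the drain-and-restart walk described above can be modeled in
userspace roughly as follows. This is only a sketch and not the kernel code:
demo_session, demo_acl, demo_lock()/demo_unlock() and demo_close_session()
are hypothetical stand-ins for se_session, se_node_acl, nacl_sess_lock and
->close_session(). It also assumes the entry is unlinked from the list
before the callback runs, which the quoted hunks do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct se_session (not the kernel layout). */
struct demo_session {
	struct demo_session *next;
	bool tearing_down;	/* mirrors sess->sess_tearing_down */
	int close_calls;	/* how often the close callback ran */
};

/* Hypothetical stand-in for struct se_node_acl. */
struct demo_acl {
	struct demo_session *head;	/* models acl_sess_list */
	bool locked;			/* single-threaded model of nacl_sess_lock */
};

static void demo_lock(struct demo_acl *acl)
{
	assert(!acl->locked);
	acl->locked = true;
}

static void demo_unlock(struct demo_acl *acl)
{
	assert(acl->locked);
	acl->locked = false;
}

/* Models ->close_session(): runs without the lock, once per session. */
static void demo_close_session(struct demo_acl *acl, struct demo_session *sess)
{
	assert(!acl->locked);
	assert(sess->close_calls == 0);	/* a second call would trip here */
	sess->close_calls++;
}

/* Drain the list: unlink one live session, drop the lock, close, restart. */
static int demo_shutdown_sessions(struct demo_acl *acl)
{
	struct demo_session **pp, *sess;
	int closed = 0;

restart:
	demo_lock(acl);
	for (pp = &acl->head; (sess = *pp) != NULL; pp = &sess->next) {
		if (sess->tearing_down)
			continue;

		/* Assumption: the session leaves the list before the callback. */
		*pp = sess->next;
		sess->next = NULL;
		demo_unlock(acl);

		demo_close_session(acl, sess);
		closed++;
		goto restart;
	}
	demo_unlock(acl);
	return closed;
}
```

In this model each live session is closed exactly once, because it is gone
from the list by the time the walk restarts, while sessions already marked
as tearing down are skipped and stay on the list. The assert in
demo_close_session() plays the role of a double-close check in the spirit of
BUG_ON(ch->release_done).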