Hi Maged, On Mon, 2016-02-08 at 17:43 -0700, Maged Mokhtar wrote: > Hi Nick > > Please find patch below: Apologies for the delayed follow-up. Comments below. > > Solves an issue with frequent "iSCSI Login negotiation failed." observed > during target discovery from Microsoft client initiator running inside a > virtual machine to a target running in another virtual machine inside the > same host under VMWare. > iscsi_target_sk_data_ready() callback is received too soon before > LOGIN_FLAGS_READY flag is set. > The patch fixes the problem by moving the section that checks the various > LOGIN_FLAGS_* from the iscsi_target_sk_data_ready() callback to the worker > function iscsi_target_do_login_rx() and in case LOGIN_FLAGS_READY is not > set, sleeps and checks it a second time before exiting. > > Maged Mokhtar mmokhtar@xxxxxxxxxxxxxxxxxx > > > diff -up a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c > --- a/drivers/target/iscsi/iscsi_target_nego.c 2016-02-08 18:07:32.282055751 -0500 > +++ b/drivers/target/iscsi/iscsi_target_nego.c 2016-02-08 18:28:59.655937903 -0500 > @@ -409,34 +409,11 @@ static void iscsi_target_sk_data_ready(s > bool rc; > > pr_debug("Entering iscsi_target_sk_data_ready: conn: %p\n", conn); > - > - write_lock_bh(&sk->sk_callback_lock); > - if (!sk->sk_user_data) { > - write_unlock_bh(&sk->sk_callback_lock); > - return; > - } > - if (!test_bit(LOGIN_FLAGS_READY, &conn->login_flags)) { > - write_unlock_bh(&sk->sk_callback_lock); > - pr_debug("Got LOGIN_FLAGS_READY=0, conn: %p >>>>\n", conn); > - return; > - } > - if (test_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags)) { > - write_unlock_bh(&sk->sk_callback_lock); > - pr_debug("Got LOGIN_FLAGS_CLOSED=1, conn: %p >>>>\n", conn); > - return; > - } > - if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) { > - write_unlock_bh(&sk->sk_callback_lock); > - pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn); > - return; > - } > - > rc = schedule_delayed_work(&conn->login_work, 0); > if (!rc) { > pr_debug("iscsi_target_sk_data_ready, schedule_delayed_work" > " got false\n"); > } > - write_unlock_bh(&sk->sk_callback_lock); > } > > static void iscsi_target_sk_state_change(struct sock *); > @@ -551,6 +528,35 @@ static void iscsi_target_do_login_rx(str > > if (conn->sock) { > struct sock *sk = conn->sock->sk; > + write_lock_bh(&sk->sk_callback_lock); > + if (!sk->sk_user_data) { > + write_unlock_bh(&sk->sk_callback_lock); > + return; > + } > + if (!test_bit(LOGIN_FLAGS_READY, &conn->login_flags)) { > + write_unlock_bh(&sk->sk_callback_lock); > + pr_debug("Got LOGIN_FLAGS_READY=0, conn: %p >>>>\n", conn); > + /* retry a second time in case we are called too soon before LOGIN_FLAGS_READY is set */ > + msleep(100); > + write_lock_bh(&sk->sk_callback_lock); > + if (!test_bit(LOGIN_FLAGS_READY, &conn->login_flags)) { > + write_unlock_bh(&sk->sk_callback_lock); > + pr_debug("Second retry. Got LOGIN_FLAGS_READY=0, conn: %p >>>>\n", conn); > + return; > + } > + } > + if (test_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags)) { > + write_unlock_bh(&sk->sk_callback_lock); > + pr_debug("Got LOGIN_FLAGS_CLOSED=1, conn: %p >>>>\n", conn); > + return; > + } > + if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) { > + write_unlock_bh(&sk->sk_callback_lock); > + pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn); > + return; > + } > + > + write_unlock_bh(&sk->sk_callback_lock); > > read_lock_bh(&sk->sk_callback_lock); > state = iscsi_target_sk_state_check(sk); > > So pondering this further, an earlier initial LOGIN_FLAGS_READY bit assignment might work as a simpler patch: diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c index 9fc9117..8720daf 100644 --- a/drivers/target/iscsi/iscsi_target_nego.c +++ b/drivers/target/iscsi/iscsi_target_nego.c @@ -1246,16 +1246,16 @@ int iscsi_target_start_negotiation( { int ret; - ret = iscsi_target_do_login(conn, login); - if (!ret) { - if (conn->sock) { - struct sock *sk = conn->sock->sk; + if (conn->sock) { + struct sock *sk = conn->sock->sk; - write_lock_bh(&sk->sk_callback_lock); - set_bit(LOGIN_FLAGS_READY, &conn->login_flags); - write_unlock_bh(&sk->sk_callback_lock); - } - } else if (ret < 0) { + write_lock_bh(&sk->sk_callback_lock); + set_bit(LOGIN_FLAGS_READY, &conn->login_flags); + write_unlock_bh(&sk->sk_callback_lock); + } + + ret = iscsi_target_do_login(conn, login); + if (ret < 0) { cancel_delayed_work_sync(&conn->login_work); cancel_delayed_work_sync(&conn->login_cleanup_work); iscsi_target_restore_sock_callbacks(conn); Note each iscsi_target_do_login_rx() work_struct callback is still clearing LOGIN_FLAGS_READ_ACTIVE after iscsi_target_do_login() is invoked each time to handle subsequent login request PDUs. So it's unclear if you'll hit a similar problem with clear_bit for LOGIN_FLAGS_READ_ACTIVE in subsequent login PDUs, or if the above to kick-off ->sk_data_ready() work_struct processing after initial login request PDU is received does the trick. Either way, please give the above a shot on your setup, and let's see where we end up. Thanks again, --nab -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html