Re: iSCSI login failure, possible race condition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Hi Nick,

Apology for the delay. I tried your patch and it does solve the login issue.
Many thanks for your help.

/Maged

--------------------------------------------------
From: "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx>
Sent: Sunday, February 28, 2016 4:15 AM
To: "Maged Mokhtar" <mmokhtar@xxxxxxxxxxxxxxxxxx>
Cc: <target-devel@xxxxxxxxxxxxxxx>
Subject: Re: iSCSI login failure, possible race condition

Hi Maged,

On Mon, 2016-02-08 at 17:43 -0700, Maged Mokhtar wrote:
Hi Nick

Please find patch below:

Apologies for the delayed follow-up.  Comments below.


Solves an issue with frequent "iSCSI Login negotiation failed."  observed
during target discovery from Microsoft client initiator running inside a
virtual machine to a target running in another virtual machine inside the
same host under VMWare.
iscsi_target_sk_data_ready() callback is received too soon before
LOGIN_FLAGS_READY flag is set.
The patch fixes the problem by moving the section that checks the various
LOGIN_FLAGS_* from the iscsi_target_sk_data_ready() callback to the worker
function iscsi_target_do_login_rx() and in case  LOGIN_FLAGS_READY is not
set, sleeps and checks it a second time before exiting.

Maged Mokhtar mmokhtar@xxxxxxxxxxxxxxxxxx


diff -up a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c --- a/drivers/target/iscsi/iscsi_target_nego.c 2016-02-08 18:07:32.282055751 -0500 +++ b/drivers/target/iscsi/iscsi_target_nego.c 2016-02-08 18:28:59.655937903 -0500
@@ -409,34 +409,11 @@ static void iscsi_target_sk_data_ready(s
        bool rc;

pr_debug("Entering iscsi_target_sk_data_ready: conn: %p\n", conn);
-
-       write_lock_bh(&sk->sk_callback_lock);
-       if (!sk->sk_user_data) {
-               write_unlock_bh(&sk->sk_callback_lock);
-               return;
-       }
-       if (!test_bit(LOGIN_FLAGS_READY, &conn->login_flags)) {
-               write_unlock_bh(&sk->sk_callback_lock);
- pr_debug("Got LOGIN_FLAGS_READY=0, conn: %p >>>>\n", conn);
-               return;
-       }
-       if (test_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags)) {
-               write_unlock_bh(&sk->sk_callback_lock);
- pr_debug("Got LOGIN_FLAGS_CLOSED=1, conn: %p >>>>\n", conn);
-               return;
-       }
- if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
-               write_unlock_bh(&sk->sk_callback_lock);
- pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn);
-               return;
-       }
-
        rc = schedule_delayed_work(&conn->login_work, 0);
        if (!rc) {
pr_debug("iscsi_target_sk_data_ready, schedule_delayed_work"
                         " got false\n");
        }
-       write_unlock_bh(&sk->sk_callback_lock);
 }

 static void iscsi_target_sk_state_change(struct sock *);
@@ -551,6 +528,35 @@ static void iscsi_target_do_login_rx(str

        if (conn->sock) {
                struct sock *sk = conn->sock->sk;
+               write_lock_bh(&sk->sk_callback_lock);
+               if (!sk->sk_user_data) {
+                       write_unlock_bh(&sk->sk_callback_lock);
+                       return;
+               }
+               if (!test_bit(LOGIN_FLAGS_READY, &conn->login_flags)) {
+                       write_unlock_bh(&sk->sk_callback_lock);
+ pr_debug("Got LOGIN_FLAGS_READY=0, conn: %p >>>>\n", conn); + /* retry a second time in case we are called too soon before LOGIN_FLAGS_READY is set */
+                       msleep(100);
+                       write_lock_bh(&sk->sk_callback_lock);
+ if (!test_bit(LOGIN_FLAGS_READY, &conn->login_flags)) {
+                               write_unlock_bh(&sk->sk_callback_lock);
+ pr_debug("Second retry. Got LOGIN_FLAGS_READY=0, conn: %p >>>>\n", conn);
+                               return;
+                       }
+               }
+               if (test_bit(LOGIN_FLAGS_CLOSED, &conn->login_flags)) {
+                       write_unlock_bh(&sk->sk_callback_lock);
+ pr_debug("Got LOGIN_FLAGS_CLOSED=1, conn: %p >>>>\n", conn);
+                       return;
+               }
+ if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
+                       write_unlock_bh(&sk->sk_callback_lock);
+ pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn);
+                       return;
+               }
+
+               write_unlock_bh(&sk->sk_callback_lock);

                read_lock_bh(&sk->sk_callback_lock);
                state = iscsi_target_sk_state_check(sk);



So pondering this further, an earlier initial LOGIN_FLAGS_READY bit
assignment might work as a simpler patch:

diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
index 9fc9117..8720daf 100644
--- a/drivers/target/iscsi/iscsi_target_nego.c
+++ b/drivers/target/iscsi/iscsi_target_nego.c
@@ -1246,16 +1246,16 @@ int iscsi_target_start_negotiation(
{
       int ret;

-       ret = iscsi_target_do_login(conn, login);
-       if (!ret) {
-               if (conn->sock) {
-                       struct sock *sk = conn->sock->sk;
+       if (conn->sock) {
+               struct sock *sk = conn->sock->sk;

-                       write_lock_bh(&sk->sk_callback_lock);
-                       set_bit(LOGIN_FLAGS_READY, &conn->login_flags);
-                       write_unlock_bh(&sk->sk_callback_lock);
-               }
-       } else if (ret < 0) {
+               write_lock_bh(&sk->sk_callback_lock);
+               set_bit(LOGIN_FLAGS_READY, &conn->login_flags);
+               write_unlock_bh(&sk->sk_callback_lock);
+       }
+
+       ret = iscsi_target_do_login(conn, login);
+       if (ret < 0) {
               cancel_delayed_work_sync(&conn->login_work);
               cancel_delayed_work_sync(&conn->login_cleanup_work);
               iscsi_target_restore_sock_callbacks(conn);


Note each iscsi_target_do_login_rx() work_struct callback is still
clearing LOGIN_FLAGS_READ_ACTIVE after iscsi_target_do_login() is
invoked each time to handle subsequent login request PDUs.

So it's unclear if you'll hit a similar problem with clear_bit for
LOGIN_FLAGS_READ_ACTIVE in subsequent login PDUs, or if the above to
kick-off ->sk_data_ready() work_struct processing after initial login
request PDU is received does the trick.

Either way, please give the above a shot on your setup, and let's see
where we end up.

Thanks again,

--nab

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux