Re: LIO iSCSI memory usage issue when running read IOs w/ IOMeter.

On 28/02/13 02:27, Nicholas A. Bellinger wrote:
On Tue, 2013-02-26 at 15:25 +0000, Benjamin ESTRABAUD wrote:
Hi Nicholas, all,

Here is a recap of our issue:

* Running iSCSI read IOs over LIO with IOMeter (version 2006.07.27) on
Windows at a queue depth (IOs/target) of 10 or above causes memory usage
to grow nearly as large, and nearly as fast, as the read IO data being
served, with IO performance degrading in proportion to the amount of
extra memory used (especially visible on a fast 10GbE link). The extra
memory used rarely exceeds 1 to 3GB; once it reaches that point, free
memory suddenly returns to its original level, at which point the cycle
restarts.

This is probably not very clear, so here are a few more details:

Free memory: 3.8GB.
Running read IOs over a GbE link at 100MB/sec: free memory decreases by
~100MB per second.
Some seconds later, free memory reaches 2.8GB.
The next second, free memory has gone back up to 3.8GB (recovered).
The above cycle then repeats continuously.

Some more detailed information about the conditions needed to reproduce the issue:

* The issue only happens on 3.5+ kernels. Works fine on 3.4 kernels.
Benjamin, first thank you for the detailed bug report.
Hi Nicholas, no problem!

So you're actually hitting a TX thread regression bug in the iscsi-target
code that causes the per connection immediate queue to be starved of CPU,
and thus prevents StatSN acknowledged iscsi_cmd descriptors within the per
connection immediate queue from releasing memory back into the
lio_qr_cache + lio_cmd_cache slabs.
That was some fast debugging! :) Thanks a lot for finding this; we had run out of options and were preparing to find the offending commit by bisecting everything between 3.4 and 3.5.
As you've observed, this is happening when a heavy per connection read
workload causes the response queue to run for long periods of time
without checking the immediate queue status.
That makes sense.
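
(As an aside, here is a small stand-alone C toy, purely an analogy and not LIO code, with every counter name made up, that models a TX loop draining its response queue with and without periodically breaking out to the immediate queue. It shows how the count of unreleased commands balloons exactly as described above.)

/*
 * Stand-alone user-space toy (NOT kernel code; all names are invented)
 * illustrating the starvation pattern: a TX loop that keeps servicing its
 * "response" queue without periodically breaking out to the "immediate"
 * queue never processes the StatSN acks that would release completed
 * commands, so unreleased commands pile up.
 */
#include <stdio.h>

#define READ_SEQUENCES 1000     /* DataIN sequences produced by the workload */

/* Returns the high-water mark of commands awaiting release. */
static int run_tx_loop(int check_immediate_every)
{
        int unreleased = 0;     /* commands waiting for their ack to be handled */
        int pending_acks = 0;   /* acks parked on the immediate queue */
        int peak = 0;

        for (int i = 1; i <= READ_SEQUENCES; i++) {
                /* response queue: one more DataIN sequence sent */
                unreleased++;
                /*
                 * The initiator acks it, but the ack only lands on the
                 * immediate queue; nothing is freed until that queue runs.
                 */
                pending_acks++;

                if (unreleased > peak)
                        peak = unreleased;

                /*
                 * The periodic check the v3.4 code had (and the patch below
                 * restores): break out of the response loop to drain acks.
                 */
                if (check_immediate_every && i % check_immediate_every == 0) {
                        unreleased -= pending_acks;
                        pending_acks = 0;
                }
        }

        /* response queue finally empty: drain whatever acks piled up */
        unreleased -= pending_acks;
        return peak;
}

int main(void)
{
        printf("peak unreleased commands, no immediate-queue check: %d\n",
               run_tx_loop(0));
        printf("peak unreleased commands, check after every sequence: %d\n",
               run_tx_loop(1));
        return 0;
}

(Built with any C99 compiler, it prints a peak of 1000 unreleased commands without the check versus 1 with it.)
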
So the v3.4 code checks for a break from execution after each
MaxBurstLength-sized DataIN sequence is sent, and this logic managed to
get broken in the following upstream commit:
Thanks a lot for your help in finding the issue, and thanks for the patch you submitted next!

We're going to apply it and retest.

Thanks.

Regards,
Ben.
commit 6f3c0e69a9c20441bdc6d3b2d18b83b244384ec6
Author: Andy Grover <agrover@xxxxxxxxxx>
Date:   Tue Apr 3 15:51:09 2012 -0700

     target/iscsi: Refactor target_tx_thread immediate+response queue loops

(grover CC'ed)

The patch below fixes the regression by following the pre-v3.5 logic of
checking the TX immediate queue bit + breaking after each DataIN sequence
has been sent.

This fixes the bug on my side with v3.8-rc7 code, and is getting pushed
to target-pending/for-next now.

Please verify on your end.

Thank you,

--nab

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 23a98e6..af77396 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -3583,6 +3583,10 @@ check_rsp_state:
                                 spin_lock_bh(&cmd->istate_lock);
                                 cmd->i_state = ISTATE_SENT_STATUS;
                                 spin_unlock_bh(&cmd->istate_lock);
+
+                               if (atomic_read(&conn->check_immediate_queue))
+                                       return 1;
+
                                 continue;
                         } else if (ret == 2) {
                                 /* Still must send status,
@@ -3672,7 +3676,7 @@ check_rsp_state:
                 }
 
                 if (atomic_read(&conn->check_immediate_queue))
-                        break;
+                        return 1;
         }
 
         return 0;
@@ -3716,12 +3720,15 @@ restart:
                     signal_pending(current))
                        goto transport_err;
 
+get_immediate:
                ret = handle_immediate_queue(conn);
                if (ret < 0)
                        goto transport_err;
 
                ret = handle_response_queue(conn);
-               if (ret == -EAGAIN)
+               if (ret == 1)
+                       goto get_immediate;
+               else if (ret == -EAGAIN)
                        goto restart;
                else if (ret < 0)
                        goto transport_err;
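
(For the retest, here is a small stand-alone helper sketch, not part of the patch and not LIO code, that prints the active object counts of the lio_cmd_cache and lio_qr_cache slabs from /proc/slabinfo. Running it in a loop while IOMeter drives the read workload should show the counts climbing and then collapsing with the regression, and staying roughly flat with the fix applied. Reading /proc/slabinfo normally requires root, and on kernels that merge slab caches these two may not appear by name.)

/*
 * Print the active/total object counts of the lio_cmd_cache and
 * lio_qr_cache slabs mentioned above, so the "grow for a while, then
 * snap back" pattern is easy to watch before and after the fix.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
        FILE *f = fopen("/proc/slabinfo", "r");
        char line[512];

        if (!f) {
                perror("fopen /proc/slabinfo");
                return 1;
        }

        while (fgets(line, sizeof(line), f)) {
                char name[64];
                unsigned long active_objs, num_objs;

                /* data lines start with: <name> <active_objs> <num_objs> ... */
                if (sscanf(line, "%63s %lu %lu", name, &active_objs,
                           &num_objs) != 3)
                        continue;

                if (!strcmp(name, "lio_cmd_cache") ||
                    !strcmp(name, "lio_qr_cache"))
                        printf("%-16s active=%lu total=%lu\n",
                               name, active_objs, num_objs);
        }

        fclose(f);
        return 0;
}
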







