Re: LIO iSCSI memory usage issue when running read IOs w/ IOMeter.

Benjamin ESTRABAUD <be@xxxxxxxxxx> · Thu, 28 Feb 2013 15:48:17 +0000

On 28/02/13 00:54, Andy Grover wrote:

On 02/26/2013 07:25 AM, Benjamin ESTRABAUD wrote:
* Running iSCSI read IOs over LIO with IOMeter (version 2006.07.27) on
Windows and a queue depth (IOs/target) of 10 or above causes the memory
usage to grow nearly as big and as fast as the read IOs received, with a
degradation of the IO performance proportional to the amount of extra
memory used (especially visible when using a fast 10GbE link). The
extra memory used rarely goes over 1 to 3GB, after which it suddenly
goes back to its original level, at which point the cycle restarts.

Hi Ben,

Hi Andy!
There were a lot of changes between 3.4 and 3.5, so instead of looking 
at them right away, I started with the .pcap :)

First, it looks like qdepth is 16, not 10 -- we get 16 when counting 
the requests in packets 10, 12, and 14 (1, 12, and 3, respectively). 
256KiB * 16 = 4MiB, so that should roughly be the most memory we're 
tying up, I guess? Less than a gig, for sure.
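
As a quick check of the arithmetic above, the bound is just queue depth times bytes per command. This is a minimal sketch; the function name is made up for illustration, and the per-command size of 256KiB is taken from the estimate in the mail, not measured separately.

```python
KIB = 1024
MIB = 1024 * KIB

def max_inflight_bytes(queue_depth: int, bytes_per_command: int) -> int:
    """Upper bound on read payload that can be outstanding at once."""
    return queue_depth * bytes_per_command

# Requests counted in packets 10, 12 and 14 of the capture.
observed_depth = 1 + 12 + 3
bound = max_inflight_bytes(observed_depth, 256 * KIB)
print(observed_depth, bound // MIB)  # prints: 16 4
```

Even with generous per-command overhead, that stays orders of magnitude below the 1 to 3GB swings reported.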

I agree, but I'm pretty sure the queue depth was set to 10: I checked
again using the .icf file that I generated while IOMeter was open, and
it was indeed 10. Also, the IOs were actually 1MB IOs. I think what
happened here was that either IOMeter or the Windows iSCSI Initiator
split the IOs into smaller chunks.
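
To illustrate the splitting guess, here is a minimal sketch of how one large IO could be cut into smaller commands. The helper name and the 256KiB chunk limit are assumptions for illustration (256KiB is the per-command size estimated from the capture); which component actually does the splitting, and by what rule, is not known from this thread.

```python
KIB = 1024

def split_io(offset: int, length: int, max_chunk: int) -> list[tuple[int, int]]:
    """Split one IO into (offset, length) pieces no larger than max_chunk."""
    chunks = []
    end = offset + length
    while offset < end:
        size = min(max_chunk, end - offset)
        chunks.append((offset, size))
        offset += size
    return chunks

# One 1MB IO (taken here as 1024KiB) with an assumed 256KiB limit.
chunks = split_io(0, 1024 * KIB, 256 * KIB)
print(len(chunks))  # prints: 4
```

Under those assumptions each 1MB IO becomes four commands, so the number of commands visible on the wire need not equal the queue depth configured in IOMeter.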

Next, is the target volume all zeroes? Because I saw some weird stuff:

4401: READ(10) of LBA 17408 len 512. CmdSN 52dc

7922: offset 0x222. Start of first Data-In PDU for 52dc
  (datasn=0, statsn=ff's)
7987: offset 0x17a Start of second Data-In PDU
  (datasn=1, statsn=ff's)
8040: offset 0xf2 Data starts looking random or scrambled or 
something, instead of all-zeroes!
8050: offset 0x106 Start of 3rd Data-In PDU looks valid, data still 
scrambled
  (datasn=2, statsn=ff's)
8083: offset 0x5e start of fourth Data-In PDU, scrambled data
  (datasn=3, statsn=0xec3e1aca)

8455: data goes back to being 00s
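
For reference, the expected Data-In layout for that command can be sketched as below. This assumes 512-byte logical blocks and a 64KiB data segment per PDU; neither value is stated in the thread, they are chosen because they yield the four PDUs (DataSN 0 to 3) seen in the trace. The function name is hypothetical.

```python
def data_in_layout(num_blocks: int, block_size: int, max_segment: int):
    """Return (datasn, buffer_offset, length, is_final) for each Data-In PDU."""
    total = num_blocks * block_size
    pdus = []
    offset = 0
    datasn = 0
    while offset < total:
        length = min(max_segment, total - offset)
        is_final = (offset + length == total)
        pdus.append((datasn, offset, length, is_final))
        offset += length
        datasn += 1
    return pdus

# READ(10) of 512 blocks, assumed 512-byte blocks and 64KiB segments.
for pdu in data_in_layout(512, 512, 64 * 1024):
    print(pdu)
```

Only the final PDU would be expected to carry a meaningful StatSN, which is consistent with the first three showing all ff's and the fourth showing 0xec3e1aca.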

The backstore on which the LIO target was created definitely had a good 
chunk of it full of zeros, as I zeroed 1GB on each of the drives before 
creating the RAID on them. So part of the IOs would return mostly zeros, 
when reading the beginning of the device, and the rest would return 
garbage (so the data is either 0s or garbage).
So that's weird. After 2289, Wireshark stops recognizing Data-In 
packets. Annoying.

Strange, not sure why either.
I'll start going through the code changes tomorrow. Aside from the 
weird data coming back, I guess the pcap looks ok. Nothing to account 
for huge memory use yet.

Looks like Nicholas found the problem since you wrote this mail, but 
nevertheless, thanks for your help.

Thanks.

Regards,
Ben.
Regards -- Andy

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
