Hi Nicholas, all,
Here is a recap of our issue:
* Running iSCSI read IOs against LIO with IOMeter (version 2006.07.27) on
Windows, with a queue depth (IOs/target) of 10 or above, causes memory
usage to grow nearly as big and as fast as the read IOs received, with a
degradation of IO performance proportional to the amount of extra memory
used (especially visible when using a fast 10GbE link). The extra memory
used rarely goes over 1 to 3GB; after that point, free memory suddenly
returns to its original level, at which point the cycle restarts.
This is probably not very clear, so here are a few more details:
Free memory: 3.8GB.
Running read IOs over a GbE link at 100MB/sec: free memory decreases by
~100MB per second.
About ten seconds later, free memory reaches 2.8GB.
The next second, free memory has gone back up to 3.8GB (recovered).
The above cycle then restarts and repeats continuously.
Some more detailed information about the conditions needed to reproduce the issue:
* The issue only happens on 3.5+ kernels. Works fine on 3.4 kernels.
* The issue only happens when using IOMeter; it does not happen when
using xdd with similar IO settings.
* The IO settings to reproduce the issue are: read IOs / 10 IOs per target
(queue depth: 10) / sequential / 1MB IOs (IOMeter config file attached).
* The issue is intermittent, but happens more often than not (~90% of
the time so far).
* The issue happens on both 32-bit and 64-bit OSes.
* The issue does not seem to happen on Fibre Channel, only iSCSI.
* If the extra memory used exceeds the available free memory, Linux's OOM
killer is invoked.
* Swap is never used.
* Inactive (page cache) memory is not used (inactive memory does not grow).
* Nothing suspicious growing out of proportion is seen in slabtop
(slabtop output available in the file archive).
* After stopping IOs, the memory usage returns to the same pre-IOs level.
Regarding the kernel versions, here's what we tried:
Tested on Linux 3.4 (commit 76e10d158efb6d4516018846f60c2ab5501900bc): works
fine; when running read IOs over iSCSI, memory usage does *not* go up.
On Ubuntu 9.10 Server (x86_64) with the default 3.5 kernel (Linux
ubuntu-redintel 3.5.0-17-generic): memory usage does grow.
On Linux 3.6.6 (commit 3820288942d1c1524c3ee85cbf503fee1533cfc3): memory
usage does grow.
On Linux 3.8.8 (commit 19f949f52599ba7c3f67a5897ac6be14bfcb1200): memory
usage does grow.
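If it would help narrow down the offending change, the next step on our
side would be a kernel bisect between the last known good and first known
bad versions. A rough sketch of what we have in mind (assuming the mainline
v3.4 and v3.5 tags as the good/bad endpoints; any known-good/known-bad pair
would do):

    # Mark the known range: v3.4 behaves, v3.5 shows the leak.
    git bisect start
    git bisect bad v3.5
    git bisect good v3.4
    # At each step: build and boot the suggested kernel, run the IOMeter
    # read workload described above, then mark the result accordingly:
    git bisect good    # free memory stayed flat
    git bisect bad     # free memory started dropping
    # Repeat until git reports the first bad commit, then clean up:
    git bisect reset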
While we did reproduce the issue on a target running over an mcp_ramdisk
backstore, we did most of our testing over an IBLOCK backstore created
on top of a RAID0 array of nine 15K enterprise SAS drives. The issue did
occur on an mcp_ramdisk, but note that an IBLOCK backstore had previously
been created during the same session and was later replaced with the
mcp_ramdisk. We then had trouble recreating the problem with an
mcp_ramdisk alone later on, but keep in mind that this could be explained
by the slightly intermittent nature of the problem. So while I *think* the
issue should happen regardless of the backstore, if you have trouble
reproducing it I would suggest trying an IBLOCK backstore first.
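For reference, recreating an equivalent IBLOCK target with targetcli would
look roughly like the sketch below. The backstore name, backing device and
IQN are placeholders rather than our actual configuration (that is in the
targetcli_config file in the archive), and the exact syntax may differ
slightly between targetcli versions:

    # Inside the targetcli shell (example names/devices only):
    /backstores/iblock create name=raid0_disk dev=/dev/md0
    /iscsi create iqn.2013-01.com.example:target0
    /iscsi/iqn.2013-01.com.example:target0/tpg1/luns create /backstores/iblock/raid0_disk
    /iscsi/iqn.2013-01.com.example:target0/tpg1/portals create 0.0.0.0 3260
    saveconfig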
To see if the issue occurs, simply run "watch -n1 head /proc/meminfo"
and observe the free memory dropping fast as IOs are being run.
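If a per-second rate is easier to read than the raw numbers, a small shell
loop like the one below (just a convenience sketch; it only reads the
MemFree line out of /proc/meminfo) prints how much free memory changes
each second:

    # Print MemFree and its change per second (values in kB).
    prev=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
    while sleep 1; do
        cur=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
        echo "MemFree: ${cur} kB  (delta: $((cur - prev)) kB/s)"
        prev=$cur
    done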
Here's a link to a file archive containing various debug information:
https://www.dropbox.com/s/gpb4vh82bcsbi75/lio_iscsi_captures.tar.gz
Here is what each of the files within this archive corresponds to:
iometer.1M.qdepth10.read.seq.pcap: 10MB Wireshark capture of the traffic
to and from the LIO target while the issue was occurring.
iometer.icf: iometer config used to reproduce the issue.
1M.qdepth10.read.seq.iostat: Dump of iostat -xm 2 on the Ubuntu target
system while the issue was occurring.
1M.qdepth10.read.seq.slabtop: Sample output of slabtop while the issue
was occurring. Slabtop refreshes every few seconds, but I could not detect
any values out of the ordinary or using a large amount of RAM (no entry
using more than 20MB, and most under 15MB).
targetcli_config: Copy of the targetcli config after running
"saveconfig" in targetcli.
iblock_backstore: Details about the Linux RAID backstore configuration
and its disk members.
Something of interest: the memory used is not directly proportional to
the number of initiators nor to the speed of the IOs.
Running read IOs from one initiator on a GbE link at 100MB/sec loses
memory at 100MB/sec for a total of 1GB, while running read IOs with
two more initiators, one of which is on a 10GbE link, at an aggregate
speed of ~400-500MB/sec loses memory at a rate only slightly faster than
100MB/sec, for a total of 2GB of RAM.
Also, when stopping an initiator from running IOs, a larger memory drop
is observed for the next second or so, perhaps as the pending IOs from
that initiator are all flushed (pure supposition on my part, based on
nothing concrete).
Another odd thing is that the issue is intermittent. The pattern behind
the intermittence is pretty hard to figure out (most of the time the issue
happened during testing, and when it didn't, a restart on both sides would
usually "make it" happen). However, when testing with three separate
initiators against three separate targets on the same system (one
initiator on the 10GbE link, the second and third ones on the two GbE
links; three separate physical initiators running different versions of
Windows and IOMeter), the issue was happening, but if the two GbE
initiators were stopped, the 10GbE IOs would "stabilize" and memory would
become constant again. Restarting the other two initiators would restart
the issue. I could reproduce this without fail during that particular
session.
In conclusion, most of the time, when using IOMeter to run intensive
(high queue depth, large IOs) read IOs, the issue will appear.
The fact that it never happened on 3.4 and always happens after 3.5
points to a change that was introduced in between.
It seems that LIO is using the memory for some sort of caching and that
the memory is being released later than it should be. The fact that it
recovers after a few seconds is also interesting.
That's quite a lot of information so I hope this is not too confusing.
I hope you can manage to reproduce the issue on your side so that you
can see with more clarity what the issue is about.
Thanks a million in advance for your help!
Regards,
Ben@MPSTOR