Re: Lower than expected iSCSI performance compared to CIFS

Hi Scott,

On Sun, 2013-08-18 at 22:43 -0600, Scott Hallowell wrote:
> I have been looking into a performance concern with the iSCSI target
> as compared to CIFS running on the same server.  The expectation was
> that iSCSI should perform somewhat similarly to Samba.  The test
> environments are Windows 7 & Windows 2008 initiators connecting to a
> target running on a Debian wheezy release (a 3.2.46 kernel).  The test
> is a file copy from Windows to the Linux server.  The source volume is
> a software RAID 0 running on Windows.  The destination is an iSCSI LUN
> on a software RAID 5 array with 4 disks (2 TB WD Reds).
> 
> The iSCSI write (from Windows to the iSCSI volume) is considerably
> slower (about half the rate) than the CIFS write.
> 
> Specifically, I see 90+ MB/s writes with Samba on both the Windows 7
> and Windows 2008 machines (using robocopy and 5.7GB of data spread
> unevenly across about 30 files).
> 
> Performing the same tests with iSCSI and what I believe to be the
> 2.0.8 version of the Windows iSCSI initiator, I am getting closer to
> 40-45 MB/s on Windows 7 and 65 MB/s on Windows 2008.
> 
> To test the theory that the issue was on the Windows side, I connected
> the Windows 7 initiator to a commercial SAN and repeated the same
> tests.  I got results of around 87 MB/s.  The commercial SAN was
> configured similarly to my Linux server - RAID 5 with four 2 TB WD Red
> disks - and has similar hardware (Intel Atom processor, e1000e NICs,
> although less physical RAM: 1GB vs 2GB).
> 
> The results are fairly repeatable (+/- a couple of MB/s) and, at least
> with Windows 7, do not appear to suggest a specific issue with the
> Windows side of the equation.  The CIFS performance would suggest (to
> me, at least) that there is not a basic networking problem, either.
> 
> I've tried a number of different things in an attempt to affect the
> iSCSI performance:  changing the disk scheduler (CFQ, deadline, and
> noop), confirming write caching is on with hdparm, tweaking vm
> parameters in the kernel, tweaking TCP and adapter parameters (both in
> Linux and Windows), etc.  Interestingly, the performance numbers do
> not seem to change by more than +/- 10% in aggregate, with enough
> variability in the results that I'd suggest the changes are
> essentially in the noise.  I will note that I have not gone to
> 9000-byte MTUs, but that seems irrelevant as the commercial SAN I
> compared against wasn't using them, either.
> 
> I attempted to look at Wireshark traces to identify any obvious
> patterns that might be had from the traffic.  Unfortunately, the
> amount of data required before I was able to start seeing repeatable
> differences in the aggregate rates (>400MB of file transfers) combined
> with offloading and the significant amount of caching in Windows has
> made such an analysis a bit tricky.
> 
> It seems to me that something is misconfigured in a very basic way
> that limits performance far more severely than simple tuning could
> explain, but I am at a loss to understand what it is.
> 
> I am hoping this rings familiar to someone who can give me some
> pointers on where I need to focus my attention.
> 

I recommend pursuing a few different things:

First, you'll want to bump the default_cmdsn_depth from 16 to 64.  This
is the maximum number of commands allowed in flight (per session) at any
given time.  It can be changed with 'set attrib default_cmdsn_depth 64'
from within the targetcli TPG context, or on a per-NodeACL basis if
you're not using TPG demo mode.
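
As a sketch, assuming a target IQN of
iqn.2003-01.org.linux-iscsi.debian:sn.1234 (a placeholder; substitute
your own) and TPG demo mode, the session would look something like:

# Placeholder IQN below; the 'set' syntax may vary slightly between
# targetcli versions (e.g. 'set attribute default_cmdsn_depth=64').
/> cd /iscsi/iqn.2003-01.org.linux-iscsi.debian:sn.1234/tpg1
/iscsi/iqn.20...:sn.1234/tpg1> set attrib default_cmdsn_depth 64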

The second thing is to try with write cache (buffered writes) enabled.
By default both IBLOCK and FILEIO run without write cache enabled, to
favor strict data integrity across a target power loss over backend
performance.  IBLOCK itself can set the WriteCacheEnabled=1 bit via
emulate_write_cache, but all WRITEs are still going to be submitted +
completed to the underlying storage (which may also have a cache of its
own) before acknowledgement.
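
As a sketch, assuming an IBLOCK backstore that appears in configfs as
iblock_0/my_disk (placeholder HBA and device names; adjust to match
your setup), the bit can be flipped directly:

# Placeholder names below; check /sys/kernel/config/target/core/ for
# the actual HBA and device directories on your system.
echo 1 > /sys/kernel/config/target/core/iblock_0/my_disk/attrib/emulate_write_cache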

For FILEIO, however, there is a buffered mode, which puts all WRITEs
into the buffer cache and acknowledges them immediately, and lets VFS
writeback occur based upon /proc/sys/vm/dirty_[writeback,expire]_centisecs.
This can be enabled during FILEIO creation in targetcli by setting
'buffered=true', which depending upon your version of the target will
automatically set 'emulate_write_cache=1'.
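
As a sketch, assuming a 1GB backing file at /tmp/test (placeholder path
and size), creation inside targetcli would look something like this;
note that the file parameter name can differ between targetcli versions:

/> cd /backstores/fileio
/backstores/fileio> create name=test file_or_dev=/tmp/test size=1G buffered=true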

You can verify that buffered mode is enabled in configfs with the 'Mode'
output, which depending on your kernel version should look something
like:

# cat /sys/kernel/config/target/core/fileio_0/test/info 
Status: DEACTIVATED  Execute/Max Queue Depth: 0/0  SectorSize: 512  MaxSectors: 1024
        TCM FILEIO ID: 0        File: /tmp/test  Size: 1073741824  Mode: Buffered

The third thing is to enable TCP_NODELAY on the Windows side, which
does not set this bit by default.  It needs to be enabled on a
per-interface basis in the registry, and should be easy enough to find
on Google.
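
For reference, the commonly cited location is a per-interface DWORD in
the registry.  The value name below comes from third-party tuning
guides, so treat it as an assumption and verify it against Microsoft's
documentation for your Windows version:

REM {INTERFACE-GUID} is a placeholder for the GUID of the initiator NIC.
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{INTERFACE-GUID}" /v TcpNoDelay /t REG_DWORD /d 1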

Please let the list know your results.

--nab




