Hi All,
As part of the 1.0.1 release preparations I ran some performance tests
to make sure there are no performance regressions in SCST overall and
in iSCSI-SCST in particular. The results were quite interesting, so I
decided to publish them together with the corresponding numbers for the
IET and STGT iSCSI targets. This isn't a real performance comparison: it
includes only a few chosen tests, because I don't have time for a
complete comparison. But I hope somebody will take up what I did and
make it complete.
Setup:
Target: HT 2.4GHz Xeon, x86_32, 2GB of memory limited to 256MB on the
kernel command line to reduce the test data footprint, 75GB 15K RPM SCSI
disk as backstorage, dual port 1Gbps E1000 Intel network card, 2.6.29
kernel.
Initiator: 1.7GHz Xeon, x86_32, 1GB of memory limited to 256MB on the
kernel command line to reduce the test data footprint, dual port 1Gbps
E1000 Intel network card, 2.6.27 kernel, open-iscsi 2.0-870-rc3.
The target exported a 5GB file on XFS for FILEIO and 5GB partition for
BLOCKIO.
Each test was run 3 times and the average was recorded. All values are
in MB/s. The tests were run with the CFQ and deadline IO schedulers on
the target. All other parameters on both target and initiator were left
at their defaults.
==================================================================
I. SEQUENTIAL ACCESS OVER SINGLE LINE
1. # dd if=/dev/sdX of=/dev/null bs=512K count=2000
                    ISCSI-SCST    IET    STGT
NULLIO:                106        105    103
FILEIO/CFQ:             82         57     55
FILEIO/deadline:        69         69     67
BLOCKIO/CFQ:            81         28      -
BLOCKIO/deadline:       80         66      -
------------------------------------------------------------------
2. # dd if=/dev/zero of=/dev/sdX bs=512K count=2000
I didn't do other write tests, because I have data on those devices.
                    ISCSI-SCST    IET    STGT
NULLIO:                114        114    114
------------------------------------------------------------------
3. /dev/sdX was formatted as ext3 and mounted on /mnt on the initiator.
Then
# dd if=/mnt/q of=/dev/null bs=512K count=2000
was run (/mnt/q had been created beforehand by the next test).
                    ISCSI-SCST    IET    STGT
FILEIO/CFQ:             94         66     46
FILEIO/deadline:        74         74     72
BLOCKIO/CFQ:            95         35      -
BLOCKIO/deadline:       94         95      -
------------------------------------------------------------------
4. /dev/sdX was formatted as ext3 and mounted on /mnt on the initiator.
Then
# dd if=/dev/zero of=/mnt/q bs=512K count=2000
was run (this is the test that creates the /mnt/q file read in the
previous test).
                    ISCSI-SCST    IET    STGT
FILEIO/CFQ:             97         91     88
FILEIO/deadline:        98         96     90
BLOCKIO/CFQ:           112        110      -
BLOCKIO/deadline:      112        110      -
------------------------------------------------------------------
Conclusions:
1. ISCSI-SCST FILEIO on buffered READs is 27% faster than IET (94 vs
74, taking each target's best result). With CFQ the difference is 42%
(94 vs 66).
2. ISCSI-SCST FILEIO on buffered READs is about 30% faster than STGT
(94 vs 72). With CFQ the difference is 104% (94 vs 46).
3. ISCSI-SCST BLOCKIO on buffered READs has about the same performance
as IET with deadline, but with CFQ it is about 170% faster (95 vs 35).
4. Buffered WRITEs are less interesting, because they are asynchronous
with many outstanding commands at a time and hence latency insensitive,
but even here ISCSI-SCST is always a bit faster than IET.
5. STGT is always the slowest, sometimes considerably.
6. BLOCKIO on buffered WRITEs is consistently faster than FILEIO, so
there is definitely room for future improvement here.
7. For some reason, access through a file system is considerably faster
than access to the same device directly.
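The percentages quoted for buffered READs can be rechecked from the
test 3 table. The following is only a minimal sketch: the dictionary
layout and the helper names gain_percent and best are made up for
illustration, and the numbers are copied from the table above.

```python
# Buffered READ throughput in MB/s, copied from the test 3 table.
# "SCST" stands for the ISCSI-SCST column; STGT has no BLOCKIO results.
READS = {
    "FILEIO/CFQ":       {"SCST": 94, "IET": 66, "STGT": 46},
    "FILEIO/deadline":  {"SCST": 74, "IET": 74, "STGT": 72},
    "BLOCKIO/CFQ":      {"SCST": 95, "IET": 35},
    "BLOCKIO/deadline": {"SCST": 94, "IET": 95},
}


def gain_percent(fast, slow):
    """Return how much faster `fast` is than `slow`, in percent."""
    return (fast - slow) / slow * 100.0


def best(target, mode):
    """Best throughput of `target` across schedulers for FILEIO or BLOCKIO."""
    return max(row[target] for name, row in READS.items()
               if name.startswith(mode) and target in row)


if __name__ == "__main__":
    scst = best("SCST", "FILEIO")
    for other in ("IET", "STGT"):
        overall = gain_percent(scst, best(other, "FILEIO"))
        cfq = gain_percent(READS["FILEIO/CFQ"]["SCST"],
                           READS["FILEIO/CFQ"][other])
        print(f"FILEIO vs {other}: best-vs-best {overall:.1f}%, CFQ {cfq:.1f}%")
    blockio_cfq = gain_percent(READS["BLOCKIO/CFQ"]["SCST"],
                               READS["BLOCKIO/CFQ"]["IET"])
    print(f"BLOCKIO/CFQ vs IET: {blockio_cfq:.1f}%")
```

Note that the "27%" and "30%" figures compare the best scheduler of
each target, not the same scheduler on both sides.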
==================================================================
II. MOSTLY RANDOM "REALISTIC" ACCESS
For this test I used the io_trash utility. For more details see
http://lkml.org/lkml/2008/11/17/444. To show the value of target-side
caching, in this test the target was run with the full 2GB of memory. I
ran io_trash with the following parameters: "2 2 ./ 500000000 50000000
10 4096 4096 300000 10 90 0 10". Total execution time was measured.
                    ISCSI-SCST    IET      STGT
FILEIO/CFQ:            4m45s      5m       5m17s
FILEIO/deadline:       5m20s      5m22s    5m35s
BLOCKIO/CFQ:          23m3s      23m5s      -
BLOCKIO/deadline:     23m15s     23m25s     -
Conclusions:
1. FILEIO is roughly five times faster than BLOCKIO (4.9x with CFQ,
4.4x with deadline for ISCSI-SCST).
2. STGT is, as usual, the slowest.
3. Deadline is always a bit slower than CFQ.
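The FILEIO vs BLOCKIO ratio can be recomputed from the execution times
in the table. This is a small sketch using the ISCSI-SCST column only;
to_seconds and speedup are illustrative helper names, not part of
io_trash or of any of the targets.

```python
import re

# ISCSI-SCST execution times copied from the table above.
TIMES = {
    "CFQ":      {"FILEIO": "4m45s", "BLOCKIO": "23m3s"},
    "deadline": {"FILEIO": "5m20s", "BLOCKIO": "23m15s"},
}


def to_seconds(text):
    """Convert a duration such as '4m45s', '5m' or '30s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or not match:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return minutes * 60 + seconds


def speedup(scheduler):
    """How many times faster FILEIO finished than BLOCKIO."""
    row = TIMES[scheduler]
    return to_seconds(row["BLOCKIO"]) / to_seconds(row["FILEIO"])


if __name__ == "__main__":
    for scheduler in TIMES:
        print(f"{scheduler}: FILEIO is {speedup(scheduler):.2f}x faster")
```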
==================================================================
III. SEQUENTIAL ACCESS OVER MPIO
Unfortunately, my dual port network card isn't capable of simultaneous
data transfers, so I had to do some "modeling" and put my network
devices in 100Mbps mode. To make this model more realistic I also used
my old IDE 5200RPM hard drive, which delivers 35MB/s of local
throughput. So I modeled, at 1/10 scale, the case of dual 1Gbps links
with 350MB/s backstorage, provided both of the following conditions
hold:
- Both links are capable of simultaneous data transfers.
- There is sufficient CPU power on both initiator and target to cover
the requirements of the data transfers.
All the tests were done with iSCSI-SCST only.
1. # dd if=/dev/sdX of=/dev/null bs=512K count=2000
NULLIO:             23
FILEIO/CFQ:         20
FILEIO/deadline:    20
BLOCKIO/CFQ:        20
BLOCKIO/deadline:   17
Single line NULLIO is 12.
So there is a 67% improvement from using 2 lines in the disk-backed
modes (20 vs 12), and 92% for NULLIO (23 vs 12). With 1Gbps links this
should be equivalent to 200MB/s. Not too bad.
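The arithmetic behind these figures can be sketched as follows. The
factor of 10 is the assumption stated above (100Mbps links and a 35MB/s
disk standing in for 1Gbps links and 350MB/s backstorage); the names
improvement_percent and scaled_to_gigabit are illustrative only.

```python
# Measured MB/s over two 100Mbps lines, copied from the table above.
TWO_LINES = {
    "NULLIO": 23,
    "FILEIO/CFQ": 20,
    "FILEIO/deadline": 20,
    "BLOCKIO/CFQ": 20,
    "BLOCKIO/deadline": 17,
}
SINGLE_LINE_NULLIO = 12  # MB/s over one 100Mbps line
SCALE = 10               # assumed 100Mbps -> 1Gbps scaling factor


def improvement_percent(two_lines, one_line=SINGLE_LINE_NULLIO):
    """Percent gain of the two-line result over the single-line baseline."""
    return (two_lines - one_line) / one_line * 100.0


def scaled_to_gigabit(mb_per_s):
    """Project a 100Mbps-model result onto 1Gbps links (linear assumption)."""
    return mb_per_s * SCALE


if __name__ == "__main__":
    for mode, value in TWO_LINES.items():
        print(f"{mode}: +{improvement_percent(value):.0f}% vs single line, "
              f"~{scaled_to_gigabit(value)} MB/s at 1Gbps")
```

The 67% figure corresponds to the 20 MB/s results measured against the
12 MB/s single-line NULLIO baseline; the projection to 200MB/s assumes
throughput scales linearly with link speed.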
==================================================================
Connections to the targets were made with the following iSCSI parameters:
# iscsi-scst-adm --op show --tid=1 --sid=0x10000013d0200
InitialR2T=No
ImmediateData=Yes
MaxConnections=1
MaxRecvDataSegmentLength=2097152
MaxXmitDataSegmentLength=131072
MaxBurstLength=2097152
FirstBurstLength=262144
DefaultTime2Wait=2
DefaultTime2Retain=0
MaxOutstandingR2T=1
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
ErrorRecoveryLevel=0
HeaderDigest=None
DataDigest=None
OFMarker=No
IFMarker=No
OFMarkInt=Reject
IFMarkInt=Reject
# ietadm --op show --tid=1 --sid=0x10000013d0200
InitialR2T=No
ImmediateData=Yes
MaxConnections=1
MaxRecvDataSegmentLength=262144
MaxXmitDataSegmentLength=131072
MaxBurstLength=2097152
FirstBurstLength=262144
DefaultTime2Wait=2
DefaultTime2Retain=20
MaxOutstandingR2T=1
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
ErrorRecoveryLevel=0
HeaderDigest=None
DataDigest=None
OFMarker=No
IFMarker=No
OFMarkInt=Reject
IFMarkInt=Reject
# tgtadm --op show --mode session --tid 1 --sid 1
MaxRecvDataSegmentLength=2097152
MaxXmitDataSegmentLength=131072
HeaderDigest=None
DataDigest=None
InitialR2T=No
MaxOutstandingR2T=1
ImmediateData=Yes
FirstBurstLength=262144
MaxBurstLength=2097152
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
ErrorRecoveryLevel=0
IFMarker=No
OFMarker=No
DefaultTime2Wait=2
DefaultTime2Retain=0
OFMarkInt=Reject
IFMarkInt=Reject
MaxConnections=1
RDMAExtensions=No
TargetRecvDataSegmentLength=262144
InitiatorRecvDataSegmentLength=262144
MaxOutstandingUnexpectedPDUs=0
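Since the three dumps are long, a small script can help spot which
negotiated parameters actually differ between targets. The sketch below
uses only short excerpts of the iSCSI-SCST and IET listings above;
parse_params and diff_params are hypothetical helpers, not part of any
of the admin tools.

```python
def parse_params(text):
    """Parse 'Key=Value' lines into a dict, skipping blank and '#' lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key] = value
    return params


def diff_params(a, b):
    """Return {key: (a_value, b_value)} for shared keys whose values differ."""
    return {key: (a[key], b[key])
            for key in sorted(a.keys() & b.keys())
            if a[key] != b[key]}


# Excerpts copied from the listings above (not the full dumps).
SCST_EXCERPT = """
MaxRecvDataSegmentLength=2097152
MaxXmitDataSegmentLength=131072
DefaultTime2Wait=2
DefaultTime2Retain=0
"""
IET_EXCERPT = """
MaxRecvDataSegmentLength=262144
MaxXmitDataSegmentLength=131072
DefaultTime2Wait=2
DefaultTime2Retain=20
"""

if __name__ == "__main__":
    differences = diff_params(parse_params(SCST_EXCERPT),
                              parse_params(IET_EXCERPT))
    for key, (scst, iet) in differences.items():
        print(f"{key}: iSCSI-SCST={scst} IET={iet}")
```

Run against the full listings, the only keys that differ between the
iSCSI-SCST and IET sessions are MaxRecvDataSegmentLength and
DefaultTime2Retain.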
Vlad