26.06.2013 11:48, Nicholas A. Bellinger wrote:
> Hi Vladislav,
>
> On Mon, 2013-06-24 at 18:18 +0300, Vladislav Bogdanov wrote:
>> Hi,
>>
>> I'm evaluating the performance of different targets (LIO and IET) on
>> top of RAID5 (mdraid) for my customer.
>>
>> In this particular test (streaming write in several threads) the load
>> is generated by a Windows 7 machine running robocopy with its default
>> settings (8 threads).
>>
>> As expected, blockio writes are slow for both targets (~33 MB/s);
>> fileio with writeback cache performs better.
>>
>> What is really weird is that the kernel version affects fileio+wb a
>> lot, but in different directions for the two targets.
>>
>> I use similar iSCSI parameters for both targets (except that 3.4's LIO
>> lacks MaxRecvDataSegmentLength), and I set Wthreads=2 for IET.
>>
>> What I see:
>>
>> IET (with fileio+wb) shows:
>>
>> * 75 MB/s with kernel 3.4 (from Debian)
>> * 85 MB/s with kernel 3.9
>>
>> LIO (with fileio+wb) shows:
>>
>> * 63 MB/s with kernel 3.4 (from Debian)
>> * 54 MB/s with kernel 3.9
>>
>> Is there any explanation for the LIO performance degradation with the
>> kernel upgrade?

My fault, that is 3.2.41, not 3.4.

> Strange. Can you verify using a TPG attribute default_cmdsn_depth value
> larger than the hardcoded default of 16..?
>
> IIRC, IET is using a larger CmdSN window by default here, so you'll want
> to increase default_cmdsn_depth=128 with this type of workload.

Already tried that, it was the first suspect. Unfortunately no luck
(a sketch of exactly what I set is at the end of this mail).

Some more observations:

With IET, iostat on the target host shows a much smoother picture, with
peaks less than 10 MB/s away from the median. With LIO the peaks are much
bigger; it looks like something forces I/O (many partial stripes) to be
flushed at an inopportune point in time. The same is visible on the
initiator side: robocopy shows percentage progress while copying, and with
IET it advances very smoothly, while with LIO the progress is somewhat
"jaggy".

I see that IET issues flushes itself, while LIO leaves that to another
kernel subsystem (or at least I didn't find where it calls flush). Could
that be the point?

> Also, verifying with a RAMDISK_MCP backend on the same setup would be
> useful for determining if it's a FILEIO specific performance issue.

A ramdisk works at wire speed both with RAMDISK_MCP and with a loop device
on tmpfs (with both iblock and fileio).

And I wouldn't say it is solely a FILEIO problem, but rather a problem of
iSCSI + mdRAID[56]. I have already spent a lot of time on this, and it
seems that IET with fileio+wb somehow almost guarantees that only complete
stripes are put on the media under this type of load, while all the other
variants (IET with fileio+wt, IET with blockio, LIO with fileio (wt or wb),
LIO with iblock) do partial-stripe writes, which are very expensive for
RAID5/6.

Another point may be that mdraid assumes there is always a local filesystem
on top of it, which is not the case with iSCSI. But, again, IET magically
does the trick: 85 MB/s is very close both to the wire speed and to the
expected maximal RAID5 write speed when the I/O size equals the stripe
size, so writing a full stripe costs only 4 I/Os (2 reads and 2 writes).
I have a 64k stripe and robocopy *seems* to use 4k I/O, and in all cases
except IET fileio+wb I see a little less than 64 I/Os to write a full
stripe, while with the latter I'm close to the ideal 4 I/Os. Of course I
exaggerate a bit, but I hope that helps to locate the problem.
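
To make that arithmetic explicit, here is a rough back-of-the-envelope
sketch of the numbers above (an illustration of my estimate, not a
measurement of what md actually does), assuming a 64k stripe, 4k incoming
writes, and a read-modify-write cycle of 2 reads + 2 writes per update:

```python
# Back-of-the-envelope model of the per-stripe backend I/O count.
# Illustrates the estimate above; it is not a trace of md's behaviour.

STRIPE_SIZE = 64 * 1024   # 64k stripe, as in my setup
IO_SIZE     = 4 * 1024    # robocopy *seems* to issue 4k writes
RMW_COST    = 4           # read-modify-write: 2 reads + 2 writes

writes_per_stripe = STRIPE_SIZE // IO_SIZE   # 16 incoming writes

# Worst case: every 4k write reaches md on its own and triggers its
# own read-modify-write cycle.
worst_case = writes_per_stripe * RMW_COST    # ~64 backend I/Os

# Best case: the whole stripe is coalesced in the cache first and hits
# md as one full-stripe update.
best_case = RMW_COST                         # ~4 backend I/Os

print(f"{writes_per_stripe} incoming writes per stripe")
print(f"worst case (no coalescing): {worst_case} backend I/Os per stripe")
print(f"best case (full stripe):    {best_case} backend I/Os per stripe")
```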
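
For completeness, regarding the default_cmdsn_depth suggestion above, this
is roughly how I bump the attribute when testing. A minimal sketch only,
assuming the usual configfs mount point at /sys/kernel/config and a single
TPG; the IQN and TPG number are placeholders for the real ones:

```python
# Minimal sketch: raise the LIO TPG default_cmdsn_depth attribute via
# configfs. Paths assume the standard configfs layout; the IQN below is
# a placeholder. I believe new sessions pick the value up at login.

from pathlib import Path

TARGET_IQN = "iqn.2003-01.org.linux-iscsi.example:target0"  # placeholder
TPG = "tpgt_1"                                              # placeholder

attr = (Path("/sys/kernel/config/target/iscsi") / TARGET_IQN / TPG
        / "attrib" / "default_cmdsn_depth")

print("current:", attr.read_text().strip())
attr.write_text("128")   # larger CmdSN window, as suggested
print("now:    ", attr.read_text().strip())
```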
Vladislav

--
Vladislav Bogdanov
Systems Architect

tel.:  +375 17 3091709
fax:   +375 17 3091717
mob.:  +375 29 6887526
E-mail: v.bogdanov@xxxxxxxxxxxxxxxxx

SaM Solutions
Minsk office, Belarus (GMT+3)
www.sam-solutions.net

Value of Talent. Delivered.