26.06.2013 11:48, Nicholas A. Bellinger wrote:
> Hi Vladislav,
>
> On Mon, 2013-06-24 at 18:18 +0300, Vladislav Bogdanov wrote:
>> Hi,
>>
>> I'm evaluating the performance of different targets (LIO and IET) on
>> top of RAID5 (mdraid) for my customer.
>>
>> In this particular test (streaming write in several threads) the load
>> is generated by a Windows 7 machine running robocopy with its default
>> settings (8 threads).
>>
>> As expected, blockio writes are slow for both targets (~33 MB/s);
>> fileio with writeback cache performs better.
>>
>> What is really weird is that the kernel version affects fileio+wb a
>> lot, but in different directions for the two targets.
>>
>> I use similar iSCSI parameters for both targets (except that 3.4's LIO
>> lacks MaxRecvDataSegmentLength), and I set Wthreads=2 for IET.
>>
>> What I see:
>>
>> IET (with fileio+wb) shows:
>>
>> * 75 MB/s with kernel 3.4 (from Debian)
>> * 85 MB/s with kernel 3.9
>>
>> LIO (with fileio+wb) shows:
>>
>> * 63 MB/s with kernel 3.4 (from Debian)
>> * 54 MB/s with kernel 3.9
>>
>> Is there any explanation for the LIO performance degradation with the
>> kernel upgrade?

My fault, that is 3.2.41, not 3.4.

> Strange. Can you verify using a TPG attribute default_cmdsn_depth value
> larger than the hardcoded default of 16..?
>
> IIRC, IET is using a larger CmdSN window by default here, so you'll want
> to increase default_cmdsn_depth=128 with this type of workload.

Already tried that, it was the first suspect. Unfortunately no luck
(a sketch of exactly what I set is at the end of this mail).

Some more observations:

With IET, iostat on the target host shows a much smoother picture, with
peaks less than 10 MB/s away from the median. With LIO the peaks are much
bigger; it looks like something forces I/O (many partial stripes) to be
flushed at an inopportune point in time. The same is visible on the
initiator side: robocopy shows percentage progress while copying, and with
IET it advances very smoothly, while with LIO the progress is somewhat
"jaggy".

I see that IET issues flushes itself, while LIO leaves that to another
kernel subsystem (or at least I didn't find where it calls flush). Could
that be the point?

> Also, verifying with a RAMDISK_MCP backend on the same setup would be
> useful for determining if it's a FILEIO specific performance issue.

A ramdisk works at wire speed both with RAMDISK_MCP and with a loop device
on tmpfs (with both iblock and fileio).

And I wouldn't say it is solely a FILEIO problem, but rather a problem of
iSCSI + mdRAID[56]. I have already spent a lot of time on this, and it
seems that IET with fileio+wb somehow almost guarantees that only complete
stripes are put on the media under this type of load, while all the other
variants (IET with fileio+wt, IET with blockio, LIO with fileio (wt or wb),
LIO with iblock) do partial-stripe writes, which are very expensive for
RAID5/6.

Another point may be that mdraid assumes there is always a local filesystem
on top of it, which is not the case with iSCSI. But, again, IET magically
does the trick: 85 MB/s is very close both to the wire speed and to the
expected maximal RAID5 write speed when the I/O size equals the stripe
size, so writing a full stripe costs only 4 I/Os (2 reads and 2 writes).
I have a 64k stripe and robocopy *seems* to use 4k I/O, and in all cases
except IET fileio+wb I see a little less than 64 I/Os to write a full
stripe, while with the latter I'm close to the ideal 4 I/Os. Of course I
exaggerate a bit, but I hope that helps to locate the problem.
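
To make that arithmetic explicit, here is a rough back-of-the-envelope
sketch of the numbers above (an illustration of my estimate, not a
measurement of what md actually does), assuming a 64k stripe, 4k incoming
writes, and a read-modify-write cycle of 2 reads + 2 writes per update:

```python
# Back-of-the-envelope model of the per-stripe backend I/O count.
# Illustrates the estimate above; it is not a trace of md's behaviour.

STRIPE_SIZE = 64 * 1024   # 64k stripe, as in my setup
IO_SIZE     = 4 * 1024    # robocopy *seems* to issue 4k writes
RMW_COST    = 4           # read-modify-write: 2 reads + 2 writes

writes_per_stripe = STRIPE_SIZE // IO_SIZE   # 16 incoming writes

# Worst case: every 4k write reaches md on its own and triggers its
# own read-modify-write cycle.
worst_case = writes_per_stripe * RMW_COST    # ~64 backend I/Os

# Best case: the whole stripe is coalesced in the cache first and hits
# md as one full-stripe update.
best_case = RMW_COST                         # ~4 backend I/Os

print(f"{writes_per_stripe} incoming writes per stripe")
print(f"worst case (no coalescing): {worst_case} backend I/Os per stripe")
print(f"best case (full stripe):    {best_case} backend I/Os per stripe")
```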
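
For completeness, regarding the default_cmdsn_depth suggestion above, this
is roughly how I bump the attribute when testing. A minimal sketch only,
assuming the usual configfs mount point at /sys/kernel/config and a single
TPG; the IQN and TPG number are placeholders for the real ones:

```python
# Minimal sketch: raise the LIO TPG default_cmdsn_depth attribute via
# configfs. Paths assume the standard configfs layout; the IQN below is
# a placeholder. I believe new sessions pick the value up at login.

from pathlib import Path

TARGET_IQN = "iqn.2003-01.org.linux-iscsi.example:target0"  # placeholder
TPG = "tpgt_1"                                              # placeholder

attr = (Path("/sys/kernel/config/target/iscsi") / TARGET_IQN / TPG
        / "attrib" / "default_cmdsn_depth")

print("current:", attr.read_text().strip())
attr.write_text("128")   # larger CmdSN window, as suggested
print("now:    ", attr.read_text().strip())
```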
Vladislav

--
Vladislav Bogdanov
Systems Architect

tel.:  +375 17 3091709
fax:   +375 17 3091717
mob.:  +375 29 6887526
E-mail: v.bogdanov@xxxxxxxxxxxxxxxxx

SaM Solutions
Minsk office, Belarus (GMT+3)
www.sam-solutions.net

Value of Talent. Delivered.