Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining

Boaz Harrosh wrote:
> FUJITA Tomonori wrote:
>> From: Benny Halevy <bhalevy@xxxxxxxxxxx>
>> Subject: Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining
>> Date: Wed, 25 Jul 2007 11:26:44 +0300
>>
>>>> However, I'm perfectly happy to go with whatever the empirical evidence
>>>> says is best .. and hopefully, now we don't have to pick this once and
>>>> for all time ... we can alter it if whatever is chosen proves to be
>>>> suboptimal.
>>> I agree.  This isn't a catholic marriage :)
>>> We'll run some performance experiments comparing the sgtable chaining
>>> implementation vs. a scsi_data_buff implementation pointing
>>> at a possibly chained sglist and let's see if we can measure
>>> any difference.  We'll send results as soon as we have them.
>> I did some tests with your sgtable patchset and the approach to use
>> separate buffer for sglists. As expected, there was no performance
>> difference with small I/Os. I've not tried very large I/Os, which
>> might give some difference.
>>
> 
> Next week I will try to mount lots of scsi_debug devices and
> use large parallel IO to try and find a difference. I will
> test Jens's sglist-arch tree against above sglist-arch+scsi_sgtable
> 


I was able to run some tests; here are my results.

The results:
PPT - is Pages Per Transfer (sg_count)

Each number is the accumulated time of 20 transfers of 32GB each,
averaged over 4 such runs. (Lower time is better)
Transfers are sg_dd into scsi_debug

Kernel         | total time 128-PPT | total time 2048-PPT
---------------|--------------------|---------------------
sglist-arch    |      47.26         | Test Failed
scsi_data_buff |      41.68         | 35.05
scsi_sgtable   |      42.42         | 36.45
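
To put the gap between the two candidates in relative terms, here is a
small sketch that computes the percent difference between two totals from
the table. The helper name pct_diff is only for illustration; it is not
part of the attached test scripts.

```shell
#!/bin/sh
# Percent difference of a compared time relative to a baseline time.
# pct_diff is a hypothetical helper, not part of the attached scripts.
pct_diff() {
    # $1 = baseline time, $2 = compared time
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (b - a) / a * 100 }'
}

# scsi_sgtable relative to scsi_data_buff, totals taken from the table
echo "128-PPT:  $(pct_diff 41.68 42.42)%"
echo "2048-PPT: $(pct_diff 35.05 36.45)%"
```

With the totals above this comes out at roughly 1.8% for 128-PPT and
roughly 4% for 2048-PPT.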


The test:
1. scsi_debug
  I loaded the scsi_debug module, which was converted and fixed for
  chaining, with the following options:
  $ modprobe scsi_debug virtual_gb=32 delay=0 dev_size_mb=32 fake_rw=1
  
  This gives a 32 GB virtual drive on 32M of memory with 0 delay,
  and with fake_rw=1 read/write do nothing.
  After that I just enabled chained IO on the device.

  So what I'm actually testing is only sg + scsi-ml request
  queuing and sglist allocation/deallocation, which is what I want
  to test.

2. sg_dd
  In the test script (see prof_test_scsi_debug attached)
  I use sg_dd in direct io mode to send a direct scsi-command
  to the above device.
  I did 2 tests; in both I transfer 32GB of data.
  1st test with an IO size of 128 (4K) pages.
  2nd test with an IO size of 2048 pages.
  The second test will successfully run only if chaining is enabled
  and working. Otherwise it will fail.
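
As a rough sketch of the arithmetic behind the two IO sizes (assuming
512-byte blocks and 4K pages, as in the test script), the following works
out how many commands each 32GB transfer is split into. The function name
cmds_per_transfer is only for illustration.

```shell
#!/bin/sh
# Sizes implied by the two tests, assuming 512-byte blocks and 4K pages.
bs=512                                  # bytes per block
page=$((4096 / bs))                     # blocks per page (8)
total_blocks=$((64 * 1024 * 1024))      # 64M blocks of 512 bytes = 32GB

# cmds_per_transfer is a hypothetical helper, not part of the test script.
cmds_per_transfer() {
    # $1 = pages per transfer (sg_count)
    bpt=$(( $1 * page ))                # blocks per transfer, as passed to sg_dd
    echo $(( total_blocks / bpt ))
}

echo "128-PPT:  $(cmds_per_transfer 128) commands per 32GB"
echo "2048-PPT: $(cmds_per_transfer 2048) commands per 32GB"
```

So the 128-page test issues 512K-byte commands and the 2048-page test
issues 8M-byte commands, the latter needing 2048 / 128 = 16 chained sglists.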

The tested Kernels:

1. Jens's sglist-arch
  I was not able to pass all tests with this Kernel. For some reason, when
  commands bigger than 256 pages are queued, the machine runs out
  of memory and kills the test. After the test is killed the system
  is left with 10M of memory and can hardly reboot.
  I added some prints at the queuecommand entry in scsi_debug.c
  and I can see that I receive the expected large sg_count and bufflen,
  but unlike in the other tests I get different pointers at scsi_sglist().
  In the other tests, since nothing else is happening on this machine
  during the test, the sglist pointer is always the same: a command comes
  in, allocates memory, does nothing in scsi_debug, is freed, and returns.
  I suspect an sglist leak or an allocation bug.

2. scsi_data_buff
  This tree is what I posted last. It is basically: 
  0. sglist-arch
  1. revert of scsi-ml support for chaining.
  2. sg-pools cleanup [PATCH AB1]
  3. scsi-ml sglist-arch [PATCH B1]
  4. scsi_data_buff patch. scsi_lib.c (Last patch sent)
  5. scsi_data_buff patch for sr.c sd.c & scsi_error.c
  6. Plus converted libata, ide-scsi, so Kernel can compile.
  7. convert of scsi_debug.c and fix for chaining.
  ( see http://www.bhalevy.com/open-osd/download/scsi_data_buff)

  All Tests run

3. scsi_sgtable 
  This tree is what I posted as patches that open this mailing thread.
  0. sglist-arch
  1. revert of scsi-ml support for chaining.
  2. sg-pools cleanup [PATCH AB1]
  3. sgtable [PATCH A2]
  4. chaining [PATCH A3]
  5. scsi_sgtable for sd sr and scsi_error
  6. Converted libata ide-scsi so Kernel can compile.
  7. convert of scsi_debug.c and fix for chaining.
  ( see http://www.bhalevy.com/open-osd/download/scsi_sgtable/linux-block/)

  All Tests run
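
For the out-of-memory failure seen with sglist-arch, one way to check the
leak suspicion would be to sample free memory while the test runs. This is
only a sketch: the function names are made up for illustration, and it
assumes the scsi-ml sg pools show up in /proc/slabinfo under names starting
with "sgpool", which may not hold on every configuration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the attached test scripts.

# Print the MemFree value (in kB) from a meminfo-format file.
mem_free_kb() {
    awk '$1 == "MemFree:" { print $2 }' "${1:-/proc/meminfo}"
}

# Sample once a second; run this in another terminal during the test.
# Free memory that keeps falling and never recovers would point at a leak.
watch_mem() {
    while :; do
        echo "$(date +%T) MemFree=$(mem_free_kb) kB"
        # Assumption: sg pool slabs are listed with an "sgpool" prefix.
        grep '^sgpool' /proc/slabinfo 2>/dev/null
        sleep 1
    done
}
```

Calling watch_mem before starting the 2048-page run and stopping it with
Ctrl-C afterwards gives a simple timeline to compare between the kernels.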

#!/bin/sh
sdx=sdb
#load the device with these params
modprobe scsi_debug virtual_gb=32 delay=0 dev_size_mb=32 fake_rw=1

# go set some live params
# $ cd /sys/bus/pseudo/drivers/scsi_debug
# $ echo 1 > fake_rw

# mess with sglist chaining
cd /sys/block/$sdx/queue
echo 4096 > max_segments
cat max_hw_sectors_kb  > max_sectors_kb
echo "max_hw_sectors_kb="$(cat max_hw_sectors_kb)
echo "max_sectors_kb="$(cat max_sectors_kb)
echo "max_segments="$(cat max_segments)
#!/bin/bash

#load the device with these params
#$ modprobe scsi_debug virtual_gb=32 delay=0 dev_size_mb=32 fake_rw=1

# go set some live params
# $ cd /sys/bus/pseudo/drivers/scsi_debug
# $ echo 1 > fake_rw

# mess with sglist chaining
# $ cd /sys/block/sdb/queue
# $ echo 4096 > max_segments
# $ cat max_hw_sectors_kb  > max_sectors_kb
# $ cat max_hw_sectors_kb 


if=/dev/zero
of=/dev/sdb

outputfile=$1.txt
echo "Testing $1"

# send 32G, $1 scatterlist elements (pages) per transfer
do_dd()
{
# blocks of one sector
bs=512
#memory page in blocks
page=8
#number of scatterlist elements in a transfer
sgs=$1
#calculate the bpt param
bpt=$(($sgs*$page))
#total blocks to transfer 32 Giga bytes
count=64M


echo $3: "bpt=$bpt"

\time bash -c \
	"sg_dd blk_sgio=1 dio=1 if=$if of=$of bpt=$bpt bs=$bs count=$count 2>/dev/null" \
	2>> $2
}

echo "BEGIN RUN $1" >> $outputfile

# warm run
for i in {1..5}; do
do_dd 2048 /dev/null $i;
done

# one page transfers
echo "one page transfers"
echo "one page transfers" >> $outputfile
for i in {1..20}; do
do_dd 128 $outputfile $i;
done

# chained
# 16K / 8 = 2K pages
# 2K / 128 = 16 chained sglists
echo "16 chained sglists"
echo "16 chained sglists" >> $outputfile
for i in {1..20}; do
do_dd 2048 $outputfile $i;
done

echo "END RUN" >> $outputfile
