On Sat, Apr 11, 2009 at 7:00 PM, Scott Carey <scott@xxxxxxxxxxxxxxxxx> wrote:
>
> On 4/11/09 11:44 AM, "Mark Wong" <markwkm@xxxxxxxxx> wrote:
>
>> On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsmith@xxxxxxxxxxxxx> wrote:
>>> On Fri, 10 Apr 2009, Scott Carey wrote:
>>>
>>>> FIO with profiles such as the below samples are easy to set up.
>>>
>>> There are some more sample FIO profiles with results from various
>>> filesystems at
>>> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide
>>
>> There are a couple of potential flaws I'm trying to characterize this
>> weekend.  I'm having second thoughts about how I did the sequential
>> read and write profiles.  Using multiple processes doesn't let it
>> really do sequential i/o.  I've done one comparison so far resulting
>> in about 50% more throughput using just one process to do sequential
>> writes.  I just want to make sure there isn't any concern about being
>> processor-bound on one core.
>
> FWIW, my raid array will do 1200MB/sec, and no tool I've used can
> saturate it without at least two processes.  'dd' and fio can get close
> (1050MB/sec) if the block size is <= ~32k <=64k.  With a postgres-sized
> 8k block, 'dd' can't top 900MB/sec or so.  FIO can saturate it only with
> two+ readers.
>
> I optimized my configuration for 4 concurrent sequential readers with 4
> concurrent random readers, and this helped the overall real-world
> performance a lot.  I would argue that on any system with concurrent
> queries, concurrency of all types is important to measure.  Postgres
> isn't going to hold up one sequential scan to wait for another.
> Postgres on a 3.16GHz CPU is CPU-bound on a sequential scan at between
> 250MB/sec and 800MB/sec on the type of tables/queries I have.
> Concurrent sequential performance was affected by:
>
> XFS -- the gain over ext3 was large.
> Readahead tuning -- about 2MB per spindle was optimal (20MB for me,
> sw raid 0 on 2x[10 drive hw raid 10]).
> Deadline scheduler -- a big difference with concurrent sequential +
> random mixed workloads.
>
> One reason your tests write so much faster than they read was the Linux
> readahead value not being tuned, as you later observed.  This helps ext3
> a lot, and XFS enough so that fio single-threaded was faster than 'dd'
> to the raw device.
>
>> The other flaw is having a minimum run time.  The max of 1 hour seems
>> to be good for establishing steady system utilization, but letting
>> some tests finish in less than 15 minutes doesn't provide "good" data.
>> "Good" meaning looking at the time series of data and feeling
>> confident it's a reliable result.  I think I'm describing that
>> correctly...
>
> It really depends on the specific test, though.  You can usually get
> random iops numbers that are realistic in a fairly short time, and
> 1-minute-long tests for me vary by about 3% (which can be +-35MB/sec in
> my case).
>
> I ran my tests on a partition that was only 20% the size of the whole
> volume, and at the front of it.  Sequential transfer varies by a factor
> of 2 across a SATA disk from start to end, so if you want to compare
> file systems fairly on sequential transfer rate you have to limit the
> partition to an area with relatively constant STR, or else one file
> system might win just because it placed your file earlier on the drive.
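For reference, a rough sketch of the tuning and the mixed fio workload Scott
describes is below.  The device names, mount point, file sizes, and runtime
are placeholders rather than his actual configuration, so treat it as a
starting point, not a recipe:

# Readahead: roughly 2MB per effective spindle; 40960 sectors = 20MB.
# /dev/md0 and /dev/sdb are example device names.
blockdev --setra 40960 /dev/md0

# Deadline elevator on the underlying disks (2.6.x sysfs interface).
echo deadline > /sys/block/sdb/queue/scheduler

# 4 concurrent sequential readers plus 4 concurrent random readers,
# postgres-sized 8k blocks, buffered I/O so readahead comes into play.
cat > mixed.fio <<'EOF'
[global]
directory=/mnt/test
size=8g
bs=8k
ioengine=sync
runtime=900
time_based
group_reporting

[seq-read]
rw=read
numjobs=4

[rand-read]
rw=randread
numjobs=4
EOF
fio mixed.fio

Rerunning the fio job with and without the readahead bump makes the buffered
sequential-read effect discussed above easy to see.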
That disk-position effect is probably what is going on with the 1-disk test:

http://207.173.203.223/~markwkm/community10/fio/linux-2.6.28-gentoo/1-disk-raid0/ext2/seq-read/io-charts/iostat-rMB.s.png

versus the 4-disk test:

http://207.173.203.223/~markwkm/community10/fio/linux-2.6.28-gentoo/4-disk-raid0/ext2/seq-read/io-charts/iostat-rMB.s.png

These are the throughput numbers, but the iops charts are in the same
directory.

Regards,
Mark
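A quick way to see the start-versus-end falloff on a single drive is a pair of
raw reads from opposite ends of the device; the device name and the skip
offset (sized for a roughly 500GB disk) are examples only:

# 1GB sequential read from the start of the drive...
dd if=/dev/sdb of=/dev/null bs=8k count=131072
# ...and from near the end (skip is in 8k blocks; adjust to your drive size).
dd if=/dev/sdb of=/dev/null bs=8k count=131072 skip=60000000

On a typical SATA disk the first rate comes out roughly double the second,
which is the factor-of-2 spread mentioned above.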