Re: PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

Scott Marlowe <scott.marlowe@xxxxxxxxx> · Mon, 28 Sep 2009 14:48:39 -0600

On Mon, Sep 28, 2009 at 5:53 AM, Sam Mason <sam@xxxxxxxxxxxxx> wrote:
> On Sun, Sep 27, 2009 at 07:22:47PM -0600, Scott Marlowe wrote:
>> >> dd if=/dev/zero of=test.txt bs=8192 count=1310720 conv=fdatasync
>> >> 10737418240 bytes (11 GB) copied, 169.482 s, 63.4 MB/s
>> >>
>> >> dd if=test.txt of=/dev/null bs=8192
>> >> 10737418240 bytes (11 GB) copied, 86.4457 s, 124 MB/s
>> >
>> > These look slow.
>
>> They are slow, they are not atypical for RAID5; especially the slow
>> writes with SW RAID-5 are typical.
>
> Wow, no wonder it's shunned so much here!  I'd not realized before that
> it incurred such a hit.
>
>> I'd try a simple test on a 2 or 3 disk RAID-0 for testing purposes
>> only to see how much faster a RAID-10 array of n*2 disks could be.
>> The increase in random write performance for RAID-10 will be even more
>> noticeable.
>
> I was thinking that the higher the bandwidth the IO subsystem could push
> the data though the more important a larger block size would be--less
> to and fro between the kernel and userspace.  If the OP reported
> considerably higher CPU usage than expected then he could try rebuilding
> with larger block sizes to see if it helps.
>
> I'm assuming that PG only issues block sized reads?  How does changing
> block size affect index access performance; does it slow it down because
> it has to pull the whole block in?

My experience has been that the stripe size of the RAID array is what
matters. It's about a compromise between something that works well
with sequential scans and something that works well with random access
at the same time.  On a purely transactional db, having a stripe size
in the 8k to 64k range seems optimal, depending on the RAID setup /
hardware.  For something on a reporting database, getting the stripe
size in the 32k to 512k range is usually best.  Most of the time I hit
32 or 64k stripes and it's decent at both.

Random IO is the killer.  If you've got at least 10 to 20% random IO,
it's what to tune for usually.

I haven't played much with larger pg blocksize on pgsql much in recent
years.  I remember someone doing so and seeing better performance up
to 32k, but that's the largest block size pg supports I believe, and
since that code path isn't as well tested as 8k, I just stick to 8k.

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general