On Sun, Jan 07, 2007 at 11:26:01PM -0500, Guy Rouillier wrote:
> Ok, I ran with the settings below, but with
>
> shared_buffers=768MB
> effective_cache_size=2048MB
> fsync=on
>
> This run took 29000 seconds. I'm beginning to think configuration
> changes are not going to buy significant additional improvement. Time
> to look at the app implementation.

Very likely, but one thing I haven't seen mentioned is what your bottleneck actually is. Is it CPU? Disk? Something else (i.e., neither CPU nor IO is at 100%)? Additionally, since you have multiple arrays, are you sure they're being utilized equally? Having something like MRTG or Cricket in place will make your tuning much easier. (There's a quick iostat/mpstat sketch at the end of this mail.)

Unlike Oracle, PostgreSQL can't answer a query from an index alone; even when an index could cover the query, it still has to visit the base table. Compared to Oracle, you'll therefore need to dedicate a lot more IO to the base tables.

Search around for PostgreSQL-on-Solaris tuning tips; there are some OS settings that can make a huge difference. In particular, by default Solaris dedicates only a fraction of memory to disk caching. That won't bother Oracle much, but it's a big deal for PostgreSQL, which leans heavily on the OS cache. I think there are some other relevant OS parameters as well.

For vacuum, you're going to need to tune the vacuum_cost_* settings so that you balance the IO impact of vacuums against the need to complete them in a reasonable time. You'll find this easiest to tune by running manual vacuums and monitoring IO activity (the knobs are sketched below).

You'll also likely need to tune the bgwriter so that checkpoints aren't killing you. If you're targeting a checkpoint every 5 minutes, you'll need to raise bgwriter_all_maxpages to at least shared_buffers (in pages) / 300 / 5; I'd round up a bit (worked numbers below). As with everything, you'll need to tweak your values from there. If you're using stock bgwriter settings, you're probably seeing a big IO spike every time a checkpoint occurs.

Speaking of which: how often are checkpoints? If you can tolerate 5 minutes of recovery time (the default checkpoint_timeout), I suggest setting checkpoint_warning to 290 seconds or so; that way, if checkpoints are happening much more often than every 5 minutes, you'll be able to see it in the logs. Going longer between checkpoints will also likely help performance, if you can tolerate longer recovery times. I haven't actually tested the correlation, but I would expect recovery to complete within checkpoint_timeout seconds at most. If being in recovery mode for 10 minutes after a crash is acceptable, try bumping checkpoint_timeout, checkpoint_warning, and checkpoint_segments and see what that does for performance. (If you do, you'll also want to tweak the bgwriter further; in this case, increasing bgwriter_delay would be easiest.)

Given what sounds like decent IO capabilities, you'll likely get better query plans by decreasing random_page_cost, probably to somewhere between 2 and 3.

Speaking of IO: if you can switch to RAID10, you'll likely get better performance, since your write load is so heavy. Normally RAID5 is a complete performance killer as soon as you're doing much writing, but I'm guessing those nice expensive Sun arrays are better than most RAID controllers.

All that being said, generally the biggest tuning impact in any database environment comes from how the application uses the database. A few sub-optimal choices in the application or schema design could easily erase every gain from all this tuning. I suggest running EXPLAIN ANALYZE on the queries that are run most often and seeing what that shows.
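To make some of the above concrete, a few sketches follow; all values and queries in them are illustrative, not recommendations. First, a quick bottleneck spot check with the stock Solaris tools:

    # Extended per-device disk stats every 5 seconds. A device with %b
    # (busy) pinned near 100 while the others sit idle means the arrays
    # aren't being hit equally.
    iostat -xn 5

    # Per-CPU stats. Idle near 0 across the board points at CPU, not IO.
    mpstat 5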
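For the vacuum_cost_* knobs, these are the settings involved (postgresql.conf); the values shown are just the stock defaults as a reference point:

    vacuum_cost_delay = 0         # ms slept once the cost limit is hit;
                                  # 0 disables cost-based vacuum delay
    vacuum_cost_page_hit = 1      # charge for a page found in shared_buffers
    vacuum_cost_page_miss = 10    # charge for a page read from disk
    vacuum_cost_page_dirty = 20   # charge for dirtying a page
    vacuum_cost_limit = 200       # budget spent before each sleep

Set vacuum_cost_delay to something small (say 10-20 ms), run a manual VACUUM, watch the IO, and adjust from there.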
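To put numbers on the bgwriter_all_maxpages formula, using your 768MB of shared_buffers (the /5 is there because the default bgwriter_delay of 200ms gives five bgwriter rounds per second):

    768MB / 8kB per page     = 98304 pages
    98304 / 300s / 5 per sec = ~66 pages per round

so I'd start around bgwriter_all_maxpages = 70 and tweak from there.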
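The checkpoint settings as a postgresql.conf sketch; the 10-minute variant is commented out, and its specific values are made up for illustration:

    # Default 5-minute checkpoints: just make over-frequent checkpoints
    # show up in the logs.
    checkpoint_warning = 290      # seconds

    # If ~10 minutes of recovery is tolerable:
    #checkpoint_timeout = 600     # seconds
    #checkpoint_warning = 590
    #checkpoint_segments = 30     # raise until the warnings stop
    #bgwriter_delay = 400         # ms; rarer checkpoints give the bgwriter
    #                             # more time, so it can run less aggressively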
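And the planner change:

    random_page_cost = 2.5        # default is 4.0; 2-3 suits arrays with
                                  # fast random IO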
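Finally, a hypothetical EXPLAIN ANALYZE run (the table and query are invented for illustration). The main thing to compare is the planner's estimated row counts against the actual ones; big mismatches usually mean stale statistics or a plan worth chasing:

    -- Hypothetical hot query; substitute one of your own.
    EXPLAIN ANALYZE
    SELECT account_id, sum(amount)
      FROM usage_detail
     WHERE recorded_at >= now() - interval '1 day'
     GROUP BY account_id;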
--
Jim Nasby                                            jim@xxxxxxxxx
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)