Re: Migrated from 8.3 to 9.0 - need to update config (re-post)

Carlo Stonebanks <stonec.register@xxxxxxxxxxxx> · Sat, 10 Sep 2011 10:54:37 -0400

Hi Kevin,

First, thanks for taking the time. I wish I could write back with quick,
terse questions to your detailed reply - but I'm sorry, this is still going
to be a wordy post.

>> max_connections = 300
>Too high.  Both throughput and latency should improve with correct use of
>a connection pooler.

Even for 300 stateful applications that can remain connected for up to a
week, continuously distilling data (imports)? The 300 is overkill, a sys
admin raised it from 100 when multiple large projects were loaded and the
server refused the additional connections. We can take large imports and
break them into multiple smaller ones which the operators are doing to try
and improve import performance. It does result in some improvement, but I
think they have gone over the top and the answer is to improve DB and OS
performance. Perhaps I don't understand how connection pooling will work
with stateful apps that are continuously reading and writing (the apps are
DB I/O bound).

> you want the controller configured for write-back (with automatic switch
> to write-through on low or failed battery, if possible).

For performance or safety reasons? Since the sys admin thinks there's no
performance benefit from this, I would like to be clear on why we should do
this.

>> Can our particular setup benefit from changing the bgwriter values?
> Probably not.  If you find that your interactive users have periods
> where queries seem to "freeze" for a few minutes at a time and then
> return to normal levels of performance, you might need to make this
> more aggressive.

We actually experience this. Once again, remember the overwhelming use of
the system is long-running import threads with continuous connections. Every
now and then the imports behave as if they are suddenly taking a deep
breath, slowing down. Sometimes, so much we cancel the import and restart
(the imports pick up where they left off).

What would the bg_writer settings be in this case?

Thanks again for your time,

Carlo

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@xxxxxxxxxxxx] 
Sent: September 9, 2011 2:16 PM
To: pgsql-performance@xxxxxxxxxxxxxx; Carlo Stonebanks
Subject: Re:  Migrated from 8.3 to 9.0 - need to update config
(re-post)

Carlo Stonebanks <stonec.register@xxxxxxxxxxxx> wrote:

> this is a full-time ETL system, with only a handful of actual
> *users* and automated processes over 300 connections running
> *import* programs 24/7

> Intel* Xeon* Processor X5560 (8M Cache, 2.80 GHz, 6.40 
> GT/s Intel* QPI) x 2, dual quad core 48 GB RAM
> RAID 10, 6 X 600 GB 15krpm SAS)

So, eight cores and six spindles.  You are probably going to see
*much* better throughput if you route those 300 workers through
about 22 connections.  Use a connection pooler which limits active
transactions to that and queues up requests to start a transaction.

> Sys admin says that battery-backup RAID controller and 
> consequent write settings should have no impact on performance.

With only six drives, I your OS, WAL files, indexes, and heap files
are all in the same RAID?  If so, your sys admin is wrong -- you
want the controller configured for write-back (with automatic switch
to write-through on low or failed battery, if possible).

> max_connections = 300

Too high.  Both throughput and latency should improve with correct
use of a connection pooler.

> shared_buffers = 
> 500MB        # At 48GB of RAM, could we go to 2GB

You might benefit from as much as 8GB, but only testing with your
actual load will show for sure.

> effective_cache_size = 
> 2457MB # Sys admin says assume 25% of 48GB

Add together the shared_buffers setting and whatever the OS tells
you is used for cache under your normal load.  It's usually 75% of
RM or higher.  (NOTE: This doesn't cause any allocation of RAM; it's
a hint to the cost calculations.)

> work_mem = 
> 512MB              # Complex reads are called many times a second

Maybe, if you use the connection pooler as described above.  Each
connection can allocate this multiple times.  So with 300
connections you could very easily start using 150GB of RAM in
addition to your shared buffers; causing a swap storm followed by
OOM crashes.  If you stay with 300 connections this *must* be
reduced by at least an order of magnitude.

> # from each connection, so what should this be?
> maintenance_work_mem = 
> 256MB  # Should this be bigger - 1GB at least?

I'd go to 1 or 2 GB.

> checkpoint_segments = 
> 128     # There is lots of write activity; this is high 

OK

> # but could it be higher?

IMO, there's unlikely to be much benefit beyond that.

> #checkpoint_completion_target not set; 
> #  Recommendation appears to be .9 for our 128 checkpoint segments

0.9 is probably a good idea.

> default_statistics_target = 
> 200  # Deprecated?

Depends on your data.  The default is 100.  You might want to leave
that in general and boost it for specific columns where you find it
is needed.  Higher values improve estimates and can lead to better
query plans, but boost ANALYZE times and query planning time.

> # What is the metric for wal_buffers setting?
> wal_buffers = 
> 4MB             # Looks low, recommendation appears to be 16MB.

16MB is good.

> # Is it really "set it and forget it"?

Yeah.

> #synchronous_commit not set; 
> 
> #  Recommendation is to turn this off and leave fsync on

If this is off, it makes lack of write-back on the controller a lot
less painful.  Even with write-back it can improve performance some.
It does mean that on a crash you can lose some committed
transactions (typically less than a second's worth), but you will
still have database integrity.

> #fsync not set; 
> 
> #  Recommendation is to leave this on

Unless you want to rebuild your database from scratch or restore
from backup on an OS crash, leave this on.

> #wal_level not set; 
> 
> #  Do we only needed for replication?

The lowest level just supports crash recovery.  The next level
supports archiving, for recovery from a PITR-style backup.  The
third level is needed to support hot standby (a replicated server on
which you can run targets as it is updated).

> # The issue of vacuum/analyze is a tricky one.
> #  Data imports are running 24/7. One the DB is seeded, the vast
> #  majority of write activity is updates, and not to indexed
> #  columns.  Deletions are vary rare.
> vacuum_cost_delay = 
> 20ms

You could try that.  I would monitor for bloat and make things more
aggressive if needed.  If you are not vacuuming aggressively enough,
performance will slowly degrade.  If you let it go too far, recovery
can be a lot of work.

> # The background writer has not been addressed at all.
> #  Can our particular setup benefit from changing the bgwriter
> #  values?

Probably not.  If you find that your interactive users have periods
where queries seem to "freeze" for a few minutes at a time and then
return to normal levels of performance, you might need to make this
more aggressive.

-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance