On Wed, Apr 8, 2009 at 10:23 AM, Bill Moran <wmoran@xxxxxxxxxxxxxxxxx> wrote:
In response to Jennifer Trey <jennifer.trey@xxxxxxxxx>:
> *maintenance_work_mem = 16384 *
If your vacuums and/or CREATE INDEX runs are taking ages, a higher value here may be worth considering. I would need to know more about the database before suggesting a number, but I have a gut feeling that this is a good place to start.
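As a sketch of the kind of postgresql.conf change that means (the value below is only an assumption to tune against your memory and data):

    maintenance_work_mem = 128MB   # used by VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY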
> *work_mem = 1024 # I think this is kb. Way to low, right? What is a better
> value?*
Be careful with work_mem. Every connection to the database can consume up to work_mem on its own, so:
If your application makes 100 connections to the database and your work_mem = 1GB, and you are running big, nasty ORDER BYs, you could be trying to use 100 GB and swapping heavily. This is a pretty extreme example, but I think it's important.
As a rule of thumb, I like to start with 1 - 2 MB and almost always leave it there. If you're doing joins and ORDER BYs on many gigabytes of data later on, then it could become an issue.
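If one report or maintenance job genuinely needs a big sort, a safer sketch is to raise work_mem only for that session instead of globally (the table and column names here are made up):

    -- keep the server-wide default small; bump it only where needed
    SET work_mem = '256MB';
    SELECT * FROM big_table ORDER BY some_column;
    RESET work_mem;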
> *shared_buffers = 1024 # min 128kB or max_connections*16kB ## Also to low.
> Right? I've got 3GB to work with!*
Assuming that's equating to 1G, then the value is about right. Common
best practice is to set this value to 1/4 - 1/3 of the memory available
for PostgreSQL. You're saying you'll have ~3G for PG, so 1G is about
right to start with.
The idea here is to be conservative with shared_buffers and then use effective_cache_size to tell the optimizer how much RAM the OS has available for buffering data. 1 GB is a good starting place.
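A rough sketch for postgresql.conf on a box with ~3 GB available to PostgreSQL (these numbers are assumptions to adjust, not gospel):

    shared_buffers = 1GB             # 1/4 - 1/3 of the memory available to PG
    effective_cache_size = 2GB       # roughly what the OS can cache on top of that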
Once the system is up and running, you can install pg_buffercache to
monitor usage and help tune it.
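Once the pg_buffercache contrib module is installed, a query along these lines shows which relations occupy the most shared buffers (just a sketch; you may also want to filter on reldatabase):

    SELECT c.relname, count(*) AS buffers
      FROM pg_buffercache b
      JOIN pg_class c ON b.relfilenode = c.relfilenode
     GROUP BY c.relname
     ORDER BY buffers DESC
     LIMIT 10;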
Good advice
> *wal_buffers = 256 # Also kB...*
> Please give your thoughts. I was also wondering about the Vacuum, force
> reindex and stuff. Are those things good to run once in a while? Force
> sounds a little brutal though!
Turn on autovacuum. I've found it's the best way to go in 99% of installs
(the corner cases being servers that have _very_ predictable workloads ...
in which case explicit, scheduled vacuums are better).
+ 1
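In postgresql.conf terms that is basically the following (autovacuum also needs the stats collector counting rows, hence track_counts):

    autovacuum = on
    track_counts = on    # autovacuum relies on these statistics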
REINDEXing is an occasional topic of discussion. Doing it occasionally
definitely saves disk space on frequently updated databases, but the
impact (if any) on performance is a subject for debate. I've yet to see
any drastic performance improvement from REINDEXing, but if you've got
obvious off-peak times (i.e., if nobody uses the system over weekends or
something) it probably doesn't hurt to reindex everything on a regular
schedule. Don't obsess over it, though.
Just remember that REINDEX is a locking command (it blocks writes to the table while it runs), so using CREATE INDEX CONCURRENTLY instead is recommended.
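A sketch of that pattern, with made-up table and index names:

    -- CONCURRENTLY avoids blocking writes, but cannot run inside a transaction block
    CREATE INDEX CONCURRENTLY orders_customer_id_idx_new ON orders (customer_id);
    DROP INDEX orders_customer_id_idx;
    ALTER INDEX orders_customer_id_idx_new RENAME TO orders_customer_id_idx;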
You can also use the pg_stat_all_indexes view to compare index scans against tuples read; this can sometimes hint at index bloat. I would also recommend the pgstattuple contrib module, which has a pgstatindex function for looking at index fragmentation.
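For example (the index name in the second query is made up):

    -- indexes with lots of tuples read relative to scans can hint at bloat
    SELECT schemaname, relname, indexrelname, idx_scan, idx_tup_read
      FROM pg_stat_all_indexes
     ORDER BY idx_tup_read DESC
     LIMIT 10;

    -- pgstatindex reports leaf density and fragmentation for a btree index
    SELECT avg_leaf_density, leaf_fragmentation
      FROM pgstatindex('public.orders_customer_id_idx');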
--Scott