On 24/09/10 09:50, Tobias Brox wrote:
We've come to a tipping point with one of our database servers. It's
generally quite loaded, but up until recently it was handling the load
well; now we're seeing that it struggles to process all the selects
fast enough. Sometimes we observe some weird lock-like behaviour (see
my other post on that), but most of the time the database server is
simply not capable of handling the load fast enough, causing queries
to pile up in the pg_stat_activity view.
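For reference, the pileup is easy to watch with something like the
query below; this assumes a pre-9.2 pg_stat_activity layout with the
procpid/current_query/waiting columns (on 9.2 and later the columns
are pid/query/state instead):

  select procpid, now() - query_start as runtime, waiting, current_query
  from pg_stat_activity
  where current_query <> '<IDLE>'
  order by query_start;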
My main hypothesis is that all the important indexes used to fit snugly
into memory, and now they don't. We'll eventually get the server moved
over to new and improved hardware, but while waiting for that to happen
we need to focus on reducing the memory footprint of the database. I
have some general questions ...
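To put rough numbers on that hypothesis, one simple check is to compare
the on-disk size of the indexes against the RAM available for caching,
using the standard size functions, e.g.:

  select pg_size_pretty(sum(pg_relation_size(indexrelid))::bigint) as total_index_size
  from pg_index;

  select schemaname, relname, indexrelname,
         pg_size_pretty(pg_relation_size(indexrelid)) as index_size
  from pg_stat_user_indexes
  order by pg_relation_size(indexrelid) desc
  limit 20;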
1) Are there any good ways to verify my hypothesis? Some months ago I
thought of running a small memory-gobbling program on the database
server just to see how much memory I could take away before we would
see indications of the database being overloaded. It seems a bit
radical, but I think the information learned from such an experiment
would be very useful ... and we never managed to set up a testing
environment that faithfully replicates production traffic. Anyway, it's
sort of too late now that we're already observing performance problems
even without the memory-gobbling script running.
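A less intrusive sanity check is to sample the buffer hit rates before
and after a busy period; this only covers shared_buffers, not the OS
cache, but a hit rate that has clearly dropped over time points in the
same direction:

  select sum(heap_blks_hit) as heap_hit, sum(heap_blks_read) as heap_read,
         sum(idx_blks_hit)  as idx_hit,  sum(idx_blks_read)  as idx_read
  from pg_statio_user_tables;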
2) I've seen it discussed earlier on this list ... shared_buffers vs
OS caches. Some claim that adjusting the size of the shared buffers
has very little effect. Still, isn't there a risk that memory is
wasted because important data is stored both in the OS cache and in
the shared buffers? What would happen if we used almost all the
available memory for shared buffers? Or turned it down to a bare
minimum and let the OS do almost all the cache handling?
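Just to illustrate which knobs are involved; the numbers below are made
up for an 8 GB box and are not a recommendation. The common advice is to
keep shared_buffers at a modest fraction of RAM and let
effective_cache_size describe shared_buffers plus the expected OS cache:

  # postgresql.conf -- illustrative values only
  shared_buffers = 2GB
  effective_cache_size = 6GB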
3) We're discussing dropping some overlapping indexes ... i.e. dropping
one of two indexes that look like this:
some_table(a)
some_table(a,b)
Would the query "select * from some_table where a=?" run slower if we
drop the first index? Significantly?
(In our situation I found that the number of distinct b's for each a
is low and that the usage stats on the second index are quite low
compared with the first one, so I think we'll drop the second index.)
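Usage stats like that can be pulled from pg_stat_user_indexes, e.g.:

  select indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
  from pg_stat_user_indexes
  where relname = 'some_table';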
4) We're also discussing dropping other indexes. Does that make sense
at all as long as we're not experiencing problems with inserts/updates?
I suppose that if an index isn't used it will just remain on disk and
won't affect memory usage ... but what if the index is rarely used?
Wouldn't it be better to do a seqscan on a table that is frequently
accessed and mostly in memory than to consult an index that has to be
read from disk?
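If it helps, candidates can be found by listing indexes with zero or
near-zero scans since the statistics were last reset; the index name in
the commented-out drop statement is just a placeholder, and indexes
backing primary keys or unique constraints need to be excluded first:

  select schemaname, relname, indexrelname, idx_scan,
         pg_size_pretty(pg_relation_size(indexrelid)) as index_size
  from pg_stat_user_indexes
  where idx_scan = 0
  order by pg_relation_size(indexrelid) desc;

  -- drop index some_schema.some_unused_index;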
Sorry for all the stupid questions ;-)
All good questions! Before (or maybe as well as) looking at index sizes
vs memory, I'd check whether any of your commonly run queries have
suddenly started to use different plans due to data growth, e.g.:
- index scan to seq scan (perhaps because effective_cache_size is too
small now)
- hash agg to sort (work_mem too small now)
We had a case of the first point happen here a while ago; the symptoms
looked very much like what you are describing.
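Comparing the current plan of one of the problem queries against what
you expect, and checking the relevant settings, is quick; some_table
and a are just placeholders for one of your real queries here:

  explain analyze
  select * from some_table where a = 42;

  show effective_cache_size;
  show work_mem;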
Re index size, you could try indexes like:
some_table(a)
some_table(b)
which may occupy less space, and the optimizer can bitmap and/or them to
work like the compound index some_table(a,b).
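In other words, something like this; the index names are just
illustrative, and EXPLAIN can then show a BitmapAnd of the two indexes
when both columns are filtered on and that is the cheapest plan:

  create index some_table_a_idx on some_table (a);
  create index some_table_b_idx on some_table (b);

  explain select * from some_table where a = 1 and b = 2;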
regards
Mark