Thomas Chille wrote:
Hi!
Some of our clients databases are performing less good after a while.
We are using autovacuum to vacuuming and analyzing the tables.
After some analyzes by my own it looks like that the tables or table
indexes are not analyzed or vacuumed fully or correctly.
You might have some index bloat....
A count(*) query takes multiple times longer on this databases as on a
fresh dumped/restored version of the same database on the same
machine.
Are you using a where clause with your count? If not, the PostgreSQL
must do a table scan to count the rows. If you have a lot of churn
prior to running vaccum, that might increase your table size
significantly and thus account for the additional time. You should find
that it drops back to the dump/restore time if you do a vacuum full (or
cluster), but that has other performance implications. You could also
tune AVD to be a bit more aggressive, but that might have other
performance-related implications for you.
During the query it looks like that postgres scans all the time the
harddisk and is reading a lot more data in then from the fresh
restored database. This showed me the output of the vmstat-commands
'bi'-attribute (blocks reading in) and the up to 99,9% wait-state of
the cpu.
My 1. question is,
if the known bugfixes for autovacuum after release 8.1.4 addressing my
depicted issues?
We are still using 8.1.4 because a database upgrade for us and our
product is a hefty step wich involves a lot of customer databases. But
if it could help we consider to upgrade to 8.1.11 or 8.3. What would u
suggest?
I don't think so...though HOT in 8.3 might help a bit (its a feature,
not a bug fix though), but based on what you say your application does
below, I'm not sure that you would see a benefit in HOT anyways.
My 2. questions is,
if i possible configured something improper?
Well, you might be better off using the row estimates in pg_class,
rather than doing a count() to get them. pg_class values are estimates,
and will be somewhat accurate provided you did a recent analyze...
For this i will give a brief overview of our database.
The database stores mainly historical data for a reporting
application. This data will be consolidated per day at frequent
intervals. For this the data of one day will be removed from the
historical tables and will be newly calculated out of some tables with
raw data. Depending on the daytime and the amount of data it takes
normaly up to 2 minutes to summarize an compress the data of one day.
After one minute break it starts again.
We talk about up to 3000 records per day out of up to 3 million
records (and growing) in the whole historical table.
Can autovacuum handle that much changing data with this configuration?
Sure. though you might be able to improve performance by using
PostgreSQL inheritance and putting the static data (that doesn't change
much) in a larger parition that you don't need to vacuum as frequently.
From default configuration differing settings:
vacuum_cost_delay = 200
vacuum_cost_page_hit = 6
vacuum_cost_limit = 100
autovacuum_naptime = 60
Strange enough a manual analyze and vacuum makes the measured count(*)
query less performant? See attached vacuum log.
I'm sure if you did the manual vacuum at the same interval as the
auto-vacuum you would see it be similarly performant.
My 3. question is,
if it possible to read the vaccuming or analyzing state of a given
table manually? Are there any indicatores in statistic tables, wich
the autovacuum demaon is using too wich can show me the progress of a
running autovacuum?
Sure, you can look at the table level row i/o stats. If you add the
number of rows deleted with the number of rows updated, you'll get a
count of the number of dead tuples (assuming you reset stats and didn't
do a vacuum ). AVD uses these values to determine when it needs to
vacuum and analyze...
We are using a customized debian Linux on Pentium 4 2,8 GHz
Thanks for any help!
regards, thomas
------------------------------------------------------------------------
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
919-463-0999/877-258-8987
http://www.otg-nc.com
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match