Re: count * performance issue

Greg Smith <gsmith@xxxxxxxxxxxxx> · Mon, 10 Mar 2008 19:01:43 -0400 (EDT)

On Mon, 10 Mar 2008, Joe Mirabal wrote:

I run queries on the data nad get immediate max and min as well as other 
aggrgate functions very quickly, however a select count(*) of the table 
takes forever usually nearly an hour or more.

Are you sure the form of "select count(*)" you're using is actually 
utilizing the index to find a useful subset?  What do you get out of 
EXPLAIN ANALZYE on the query?

In order for indexes to be helpful a couple of things need to happen:
1) They have to be structured correctly to be useful
2) There needs to be large enough settings for shared_buffes and 
effective_cache_size that the database things it can use them efficiently
3) The tables involved need to be ANALYZEd to keep their statistics up to 
date.

The parameters to run a 400GB *table* are very different from the 
defaults; if you want tuning suggestions you should post the non-default 
entries in your postgresql.conf file from what you've already adjusted 
along with basic information about your server (PostgreSQL version, OS, 
memory, disk setup).

We in our warehouse use the count(*) as our verification of counts by 
day/month's etc

If you've got a database that size and you're doing that sort of thing on 
it, you really should be considering partitioning as well.

 --
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance