On 21/01/10 22:15, Tory M Blue wrote:
· Data distribution = In the 98mill records, there are 7000 unique
makeid's, and 21mill unique UID's. About 41mill of the records have
tagged=true
· Time to execute the following query with indices on makeid and
tagged = 90-120 seconds. The planner uses the webid index and filters on
tagged and then rechecks the webid index
* SELECT COUNT(DISTINCT uid ) AS active_users FROM
pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true*
· Time to execute the the same query with a combined index on makeid
and tagged = 60-100 seconds. The planner uses the combined index and then
filters tagged.
Two things:
1. You have got the combined index on (makeid, tagged) and not (tagged,
makeid) haven't you? Just checking.
2. If it's mostly tagged=true you are interested in you can always use a
partial index: CREATE INDEX ... (makeid) WHERE tagged
This might be a win even if you need a second index with WHERE NOT tagged.
Also, either I've not had enough cofee yet, or a bitmap scan is an odd
choice for only ~ 13000 rows out of 100 million.
* " -> Bitmap Index Scan on
idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"*
* " Index Cond: ((makeid = 'b1mw-ferman'::text)
AND (tagged = true))"*
Otherwise, see what Craig said.
I'm assuming this isn't the query that is CPU bound for a long time.
Unless your table is horribly bloated, there's no reason for that
judging by this plan.
--
Richard Huxton
Archonet Ltd
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance