On 22/01/10 18:03, Tory M Blue wrote:
On Fri, Jan 22, 2010 at 1:42 AM, Richard Huxton<dev@xxxxxxxxxxxx> wrote:
On 21/01/10 22:15, Tory M Blue wrote:
2. If it's mostly tagged=true you are interested in you can always use a
partial index: CREATE INDEX ... (makeid) WHERE tagged
This might be a win even if you need a second index with WHERE NOT tagged.
Partial index doesn't seem to fit here due to the fact that there are 35-40%
Marked True.
Didn't think about creating a second index for false, may give that a shot.
If you're mostly search tagged=true, try the partial index - it'll mean
the planner is just scanning the index for the one term.
Also, either I've not had enough cofee yet, or a bitmap scan is an odd
choice for only ~ 13000 rows out of 100 million.
* " -> Bitmap Index Scan on
idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"*
* " Index Cond: ((makeid =
'b1mw-ferman'::text)
AND (tagged = true))"*
Otherwise, see what Craig said.
I'm assuming this isn't the query that is CPU bound for a long time. Unless
your table is horribly bloated, there's no reason for that judging by this
plan.
It is, but not always, only when there are 10K more matches. And the explain
unfortunately is sometimes way high or way low, so the expalin is hit and
miss.
But the same sql that returns maybe 500 rows is pretty fast, it's the return
of 10K+ rows that seems to stall and is CPU Bound.
Hmm - might be able to push that cross-over point up a bit by tweaking
various costs, but you've got to be careful you don't end up making all
your other queries worse. It'd be good to figure out what the problem is
first.
Looking at the query there are four stages:
1. Scan the index, build a bitmap of heap pages with matching rows
2. Scan those pages, find the rows that match
3. Run DISTINCT on the uids
4. Count them
I wonder if it could be the DISTINCT. What happens with a count(*) or
count(uid) instead? Also - you might find EXPLAIN ANALYZE more useful
than straight EXPLAIN here. That will show actual times for each stage.
On Craig's branch of this thread, you say you call it 6000 times with
different "makeid"s. Any reason why you can't join to a temp table and
just do it in one query?
--
Richard Huxton
Archonet Ltd
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance