"HashAggregate (cost=156301.82..156301.83 rows=2 width=26) (actual time=2692.806..2692.807 rows=2 loops=1)"
" -> Bitmap Heap Scan on blocks (cost=14810.54..155828.95 rows=472871 width=26) (actual time=289.828..1593.893 rows=575186 loops=1)"
" Recheck Cond: (created > '2012-01-29 00:00:00+00'::timestamp with time zone)"
" Filter: (shared IS FALSE)"
" -> Bitmap Index Scan on blocks_created_idx (cost=0.00..14786.89 rows=550404 width=0) (actual time=277.407..277.407 rows=706663 loops=1)"
" Index Cond: (created > '2012-01-29 00:00:00+00'::timestamp with time zone)"
"Total runtime: 2693.107 ms"
To answer your (non-)question about Heroku: it's a cloud service, so I don't host PostgreSQL myself. I'm not sure how much I can change GUC settings, since I don't even have access to the "postgres" database on the server. I'm a long-time SQL user but new to Postgres, so I welcome suggestions on where to start with that sort of thing. Setting enable_seqscan=false made a huge difference, so I think I'll start there.
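In case it helps anyone else, here's roughly what I ran (session-level
only, since I can't touch the server configuration):

    -- Discourage sequential scans for this session, then re-test:
    SET enable_seqscan = false;
    EXPLAIN ANALYZE
    SELECT private, COUNT(block_id) FROM blocks
    WHERE created > 'yesterday' AND shared IS FALSE
    GROUP BY private;
    -- Put the planner back to its default behavior afterwards:
    RESET enable_seqscan;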
Thank you very much!
-Alessandro
On Mon, Jan 30, 2012 at 11:24 AM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote:
On Mon, Jan 30, 2012 at 4:13 PM, Alessandro Gagliardi
<alessandro@xxxxxxxx> wrote:
> So, here's the query:
>
> SELECT private, COUNT(block_id) FROM blocks WHERE created > 'yesterday' AND
> shared IS FALSE GROUP BY private
>
> What confuses me is that though this is a largish table (millions of rows)
> with constant writes, the query is over indexed columns of types timestamp
> and boolean, so I would expect it to be very fast. The WHERE created >
> 'yesterday' clause is there mostly to speed it up, but apparently it
> doesn't help much.

The number of rows touched is ~0.5M, and is correctly estimated, which
would lead me to believe PG estimates the index plan to be slower.
You could try executing "set enable_seqscan=false;" first and then
running your query with EXPLAIN ANALYZE again. You'll probably get an
index scan, and you'll see both how it performs and how PG thought it
would perform. Any mismatch between the two probably means you'll have
to change the planner tunables (the *_tuple_cost ones) to better match
your hardware.
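For example, a minimal sketch (the values below are purely
illustrative; the right ones depend on your hardware, and
random_page_cost, while not one of the *_tuple_cost settings, is often
the first knob to try on cache-heavy setups):

    -- Make index access look cheaper to the planner; the defaults are
    -- random_page_cost = 4.0 and cpu_index_tuple_cost = 0.005.
    SET random_page_cost = 2.0;
    SET cpu_index_tuple_cost = 0.001;
    -- Then re-run EXPLAIN ANALYZE and compare estimates to actual times.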
snip
> As for Hardware: I'm using Heroku's "Ronin" setup which involves 1.7 GB
> Cache. Beyond that I don't really know.
> As for GUC Settings: Again, I don't know what this is. Whatever Heroku
> defaults to is what I'm using.

And there's your problem. Without knowing/understanding those, you
won't get anywhere. I don't know what Heroku is, but you should find
out both hardware details and PG configuration details.
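Even without superuser access, a regular psql session can usually read
the current configuration, e.g.:

    -- Inspect a few planner- and memory-related settings:
    SELECT name, setting, unit
    FROM pg_settings
    WHERE name IN ('shared_buffers', 'effective_cache_size',
                   'work_mem', 'random_page_cost', 'seq_page_cost');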
> As for Maintenance Setup: I let Heroku handle that, so again, I don't
> really know. FWIW though, vacuuming should not really be an issue (as I
> understand it) since I don't really do any updates or deletions. It's
> pretty much all inserts and selects.

Maintenance also includes analyzing the table, to gather stats that
feed the optimizer, and it's very important to keep the stats
accurate. You can do it manually - just run ANALYZE. However, the plan
doesn't show any serious mismatch between expected and actual row
counts, which suggests stats aren't your problem.
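A minimal sketch of that (the table name is taken from the query
above; pg_stat_user_tables also shows when autovacuum last did it):

    ANALYZE blocks;
    -- Check when statistics were last refreshed, manually or by autovacuum:
    SELECT relname, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
    WHERE relname = 'blocks';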