Jay Levitt wrote:
And yep! When I do a CREATE TABLE AS from that view, and add an index on
user_id, it works just as I'd like.
Or not. Feel free to kick me back over to pgsql-novice, but I don't get why
the GROUP BY in this subquery forces it to scan the entire users table (seq
scan here, index scan on a larger table) when there's only one row in users
that can match:
create table questions (
id int not null primary key,
user_id int not null
);
insert into questions
select generate_series(1,1100), (random()*2000)::int;
create table users (
id int not null primary key
);
insert into users select generate_series(1, 2000);
vacuum freeze analyze;
explain analyze
select questions.id
from questions
join (
select u.id
from users as u
group by u.id
) as s
on s.id = questions.user_id
where questions.id = 1;
Hash Join (cost=42.28..89.80 rows=2 width=4) (actual time=0.857..1.208
rows=1 loops=1)
Hash Cond: (u.id = questions.user_id)
-> HashAggregate (cost=34.00..54.00 rows=2000 width=4) (actual
time=0.763..1.005 rows=2000 loops=1)
-> Seq Scan on users u (cost=0.00..29.00 rows=2000 width=4)
(actual time=0.003..0.160 rows=2000 loops=1)
-> Hash (cost=8.27..8.27 rows=1 width=8) (actual time=0.015..0.015
rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Index Scan using questions_pkey on questions (cost=0.00..8.27
rows=1 width=8) (actual time=0.012..0.013 rows=1 loops=1)
Index Cond: (id = 1)
Total runtime: 1.262 ms
This is on patched 9.0.5 built earlier today. The real query has
aggregates, so it really does need GROUP BY.. I think..
Jay
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance