On Nov 10, 2008, at 1:31 PM, Tom Lane wrote:
"Jim 'Decibel!' Nasby" <jnasby@xxxxxxxxxxxxxx> writes:
Here's the commands to generate the test case:
create table a(a int, b text default 'test text');
create table c(c_id serial primary key, c_text text);
insert into c(c_text) values('a'),('b'),('c');
create table b(a int, c_id int references c(c_id));
create view v as select a, b, null as c_id, null as c_text from a
union all select a, null, b.c_id, c_text from b join c on (b.c_id=
c.c_id);
\timing
insert into a(a) select generate_series(1,9999999);
select count(*) from a;
select count(*) from v;
explain analyze select count(*) from a;
explain analyze select count(*) from v;
I think what you're looking at is projection overhead and per-plan-
node
overhead (the EXPLAIN ANALYZE in itself contributes quite a lot of the
latter).
True... under HEAD explain took 13 seconds while a plain count took
10. Still not very good considering the count from the raw table took
about 4 seconds (with or without explain).
One thing you could do is be more careful about making the
union input types match up so that no subquery scan nodes are
required:
create view v2 as select a, b, null::int as c_id, null::text as
c_text from a
union all select a, null::text, b.c_id, c_text from b join c on
(b.c_id=c.c_id);
On my machine this runs about twice as fast as the original view.
Am I missing some magic? I'm still getting the subquery scan.
decibel@xxxxxxxxxxxxx=# explain select count(*) from v2;
QUERY PLAN
------------------------------------------------------------------------
--------------
Aggregate (cost=279184.19..279184.20 rows=1 width=0)
-> Append (cost=0.00..254178.40 rows=10002315 width=0)
-> Subquery Scan "*SELECT* 1" (cost=0.00..254058.50
rows=10000175 width=0)
-> Seq Scan on a (cost=0.00..154056.75
rows=10000175 width=14)
-> Subquery Scan "*SELECT* 2" (cost=37.67..119.90
rows=2140 width=0)
-> Hash Join (cost=37.67..98.50 rows=2140 width=40)
Hash Cond: (b.c_id = c.c_id)
-> Seq Scan on b (cost=0.00..31.40 rows=2140
width=8)
-> Hash (cost=22.30..22.30 rows=1230 width=36)
-> Seq Scan on c (cost=0.00..22.30
rows=1230 width=36)
(10 rows)
Time: 0.735 ms
decibel@xxxxxxxxxxxxx=# \d v2
View "public.v2"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
b | text |
c_id | integer |
c_text | text |
View definition:
SELECT a.a, a.b, NULL::integer AS c_id, NULL::text AS c_text
FROM a
UNION ALL
SELECT b.a, NULL::text AS b, b.c_id, c.c_text
FROM b
JOIN c ON b.c_id = c.c_id;
That's on HEAD, btw.
--
Decibel! jnasby@xxxxxxxxxxxxxx (512) 569-9461
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance