On 10/30/2012 12:33 AM, Peter Geoghegan wrote: > On 29 October 2012 16:00, Jeff Janes <jeff.janes@xxxxxxxxx> wrote: >> The re-writing does not occur at the level of SQL, but rather at the >> level of an internal representation "parse tree". > > Right. The post parse-analysis tree (i.e. the Query struct) is > potentially rewritten by an explicit rewrite stage after > parse-analysis but immediately before planning. No rewriting of plans > actually occurs. This is part of the rules subsystem, which is > generally considered to be a foot-gun. It *is* mostly just as if the > query had been silently rewritten, and had its SQL magically changed. > > Craig mentioned that two distinct queries can produce the same query > plan. That's true, if a little academic. I guess the best example of > that is with join syntax. Technically, these could be two distinct > queries, in that the query trees would be substantively different > (according to pg_stat_statements own definition of that, which is > essentially that they're not bitwise identical in respect of their > nodes' substantive fields): > > select * from foo f join bar b on f.bid = b.bid; > > select * from foo f, bar b where f.bid = b.bid; > > However, after planning, they could indeed have identical plans. I'm talking about more substantial transformations too. For example, given sample data: CREATE TABLE a ( id integer primary key, somedata text not null ); INSERT INTO a(id, somedata) VALUES (1,'ham'),(2,'eggs'),(3,'spam'),(4,'putty'); CREATE TABLE b ( a_id integer not null references a(id)); INSERT INTO b VALUES (1),(3); regress=> EXPLAIN SELECT somedata FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id); QUERY PLAN ----------------------------------------------------------------------- Hash Join (cost=44.50..73.41 rows=615 width=32) Hash Cond: (a.id = b.a_id) -> Seq Scan on a (cost=0.00..22.30 rows=1230 width=36) -> Hash (cost=42.00..42.00 rows=200 width=4) -> HashAggregate (cost=40.00..42.00 rows=200 width=4) -> Seq Scan on b (cost=0.00..34.00 rows=2400 width=4) (6 rows) regress=> EXPLAIN SELECT somedata FROM a WHERE id IN (SELECT a_id FROM b); QUERY PLAN ----------------------------------------------------------------------- Hash Join (cost=44.50..73.41 rows=615 width=32) Hash Cond: (a.id = b.a_id) -> Seq Scan on a (cost=0.00..22.30 rows=1230 width=36) -> Hash (cost=42.00..42.00 rows=200 width=4) -> HashAggregate (cost=40.00..42.00 rows=200 width=4) -> Seq Scan on b (cost=0.00..34.00 rows=2400 width=4) (6 rows) or: ALTER TABLE b ADD UNIQUE(a_id); regress=> EXPLAIN SELECT somedata FROM a INNER JOIN b ON (a.id = b.a_id); QUERY PLAN ----------------------------------------------------------------------- Nested Loop (cost=0.00..17.58 rows=2 width=32) -> Seq Scan on b (cost=0.00..1.02 rows=2 width=4) -> Index Scan using a_pkey on a (cost=0.00..8.27 rows=1 width=36) Index Cond: (id = b.a_id) (4 rows) regress=> EXPLAIN SELECT somedata FROM a WHERE id IN (SELECT a_id FROM b); QUERY PLAN ----------------------------------------------------------------------- Nested Loop (cost=0.00..17.58 rows=2 width=32) -> Seq Scan on b (cost=0.00..1.02 rows=2 width=4) -> Index Scan using a_pkey on a (cost=0.00..8.27 rows=1 width=36) Index Cond: (id = b.a_id) (4 rows) regress=> EXPLAIN SELECT somedata FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id); QUERY PLAN ----------------------------------------------------------------------- Nested Loop (cost=0.00..17.58 rows=2 width=32) -> Seq Scan on b (cost=0.00..1.02 rows=2 width=4) -> Index Scan using a_pkey on a (cost=0.00..8.27 rows=1 width=36) Index Cond: (id = b.a_id) (4 rows) These are three very different ways to write the query, and they all result in the same plan. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general