Re: Union within View vs.Union of Views

Tom Lane <tgl@xxxxxxxxxxxxx> · Sat, 03 Nov 2007 21:38:35 -0400

Heikki Linnakangas <heikki@xxxxxxxxxxxxxxxx> writes:
> Jeff Larsen wrote:
>> If it's possible to consider this abstractly, is there any particular
>> reason why there is such a vast difference in performance?

> That's surprising. The planner knows how to push down WHERE conditions 
> to parts of a UNION ALL, and should be able to generate the same plan in 
> both cases.

There are a bunch of special cases where it can't do that, though.
Look into src/backend/optimizer/path/allpaths.c, particularly
subquery_is_pushdown_safe:

 * Conditions checked here:
 *
 * 1. If the subquery has a LIMIT clause, we must not push down any quals,
 * since that could change the set of rows returned.
 *
 * 2. If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot push
 * quals into it, because that would change the results.
 *
 * 3. For subqueries using UNION/UNION ALL/INTERSECT/INTERSECT ALL, we can
 * push quals into each component query, but the quals can only reference
 * subquery columns that suffer no type coercions in the set operation.
 * Otherwise there are possible semantic gotchas.  So, we check the
 * component queries to see if any of them have different output types;
 * differentTypes[k] is set true if column k has different type in any
 * component.

and qual_is_pushdown_safe:

 * Conditions checked here:
 *
 * 1. The qual must not contain any subselects (mainly because I'm not sure
 * it will work correctly: sublinks will already have been transformed into
 * subplans in the qual, but not in the subquery).
 *
 * 2. The qual must not refer to the whole-row output of the subquery
 * (since there is no easy way to name that within the subquery itself).
 *
 * 3. The qual must not refer to any subquery output columns that were
 * found to have inconsistent types across a set operation tree by
 * subquery_is_pushdown_safe().
 *
 * 4. If the subquery uses DISTINCT ON, we must not push down any quals that
 * refer to non-DISTINCT output columns, because that could change the set
 * of rows returned.  This condition is vacuous for DISTINCT, because then
 * there are no non-DISTINCT output columns, but unfortunately it's fairly
 * expensive to tell the difference between DISTINCT and DISTINCT ON in the
 * parsetree representation.  It's cheaper to just make sure all the Vars
 * in the qual refer to DISTINCT columns.
 *
 * 5. We must not push down any quals that refer to subselect outputs that
 * return sets, else we'd introduce functions-returning-sets into the
 * subquery's WHERE/HAVING quals.
 *
 * 6. We must not push down any quals that refer to subselect outputs that
 * contain volatile functions, for fear of introducing strange results due
 * to multiple evaluation of a volatile function.

Idly looking at this, I'm suddenly wondering whether the prohibition on
pushing into an EXCEPT is necessary.  If a qual eliminates rows from the
EXCEPT's output, can't we just eliminate those same rows from the
inputs?

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq