On Mon, 2012-09-17 at 16:44 -0700, Mike Christensen wrote: > On Mon, Sep 17, 2012 at 4:32 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > > Mike Christensen <mike@xxxxxxxxxxxxx> writes: [-----------] > > It's easier to understand why this is if you realize that SQL has a very > > clear model of a "pipeline" of query execution. Conceptually, what > > happens is: > > > > 1. Form the cartesian product of the tables listed in FROM (ie, all > > combinations of rows). > > > > 2. Apply the WHERE condition to each row from 1, and drop rows that > > don't pass it. > > > > 3. If there's a GROUP BY, merge the surviving rows into groups. > > > > 4. If there's aggregate functions, compute those over the rows in > > each group. > > > > 5. If there's a HAVING, filter the grouped rows according to that. > > > > 6. Evaluate the SELECT expressions for each remaining row. > > > > 7. If there's an ORDER BY, evaluate those expressions and sort the > > remaining rows accordingly. > > > > (Obviously, implementations try to improve on this - you don't want > > to actually form the cartesian product - but that's the conceptual > > model.) > > > > The traditional shortcut of doing "ORDER BY select-column-reference" > > is okay according to this world view, because the SELECT expressions [--------------] > > are already available when ORDER BY needs them. However, it's not > > sensible to refer to SELECT outputs in WHERE, HAVING, or GROUP BY, > > because those steps precede the evaluation of the SELECT expressions. > > > > This isn't just academic nit-picking either, because the SELECT > > expressions might not be valid for rows that don't pass WHERE etc. > > Consider > > SELECT 1/x AS inverse FROM data WHERE x <> 0; > > The implementation *must* apply WHERE before computing the SELECT > > expressions, or it'll get zero-divide failures that should not happen. [-----------------] > > Excellent information, Tom! I've been somewhat curious on this > behavior for some time now, and it's great to get a detailed answer.. > Yes. But it puzzles me, if it *conceptually* would be a signifficant misstake, when what Tom calls "select-column-reference" (I understand as: the colunm name introduced on the select-list), would actually be regarded by the SQL parser as "macro-definition". Just to place the *string*, defined at select-list-level by "AS <name>", to wherever it's used in the WHERE/ORDER/GROUP or HAVING clauses at earlier then select-list-evaluation processing stage. Actual Tom's example(1): SELECT 1/x AS inverse FROM data WHERE x <> 0; extended to (2): SELECT 1/x AS inverse FROM data WHERE x <> 0 AND 1/x > 20; could be written by user as (3): SELECT 1/x AS inverse FROM data WHERE x <> 0 AND inverse > 20; but token/replaced to its form (2) before WHERE evaluation. -R -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general