Re: RFE: Column aliases in WHERE clauses

Rafal Pietrak <rafal@xxxxxxxxxxxxxxxxxx> · Tue, 18 Sep 2012 10:29:36 +0200

On Mon, 2012-09-17 at 16:44 -0700, Mike Christensen wrote:
> On Mon, Sep 17, 2012 at 4:32 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> > Mike Christensen <mike@xxxxxxxxxxxxx> writes:

[-----------]

> > It's easier to understand why this is if you realize that SQL has a very
> > clear model of a "pipeline" of query execution.  Conceptually, what
> > happens is:
> >
> > 1. Form the cartesian product of the tables listed in FROM (ie, all
> > combinations of rows).
> >
> > 2. Apply the WHERE condition to each row from 1, and drop rows that
> > don't pass it.
> >
> > 3. If there's a GROUP BY, merge the surviving rows into groups.
> >
> > 4. If there's aggregate functions, compute those over the rows in
> > each group.
> >
> > 5. If there's a HAVING, filter the grouped rows according to that.
> >
> > 6. Evaluate the SELECT expressions for each remaining row.
> >
> > 7. If there's an ORDER BY, evaluate those expressions and sort the
> > remaining rows accordingly.
> >
> > (Obviously, implementations try to improve on this - you don't want
> > to actually form the cartesian product - but that's the conceptual
> > model.)
> >
> > The traditional shortcut of doing "ORDER BY select-column-reference"
> > is okay according to this world view, because the SELECT expressions

[--------------]

> > are already available when ORDER BY needs them.  However, it's not
> > sensible to refer to SELECT outputs in WHERE, HAVING, or GROUP BY,
> > because those steps precede the evaluation of the SELECT expressions.
> >
> > This isn't just academic nit-picking either, because the SELECT
> > expressions might not be valid for rows that don't pass WHERE etc.
> > Consider
> >         SELECT 1/x AS inverse FROM data WHERE x <> 0;
> > The implementation *must* apply WHERE before computing the SELECT
> > expressions, or it'll get zero-divide failures that should not happen.

[-----------------]

> 
> Excellent information, Tom!  I've been somewhat curious on this
> behavior for some time now, and it's great to get a detailed answer..
> 

Yes.

But it puzzles me, if it *conceptually* would be a signifficant
misstake, when what Tom calls "select-column-reference" (I understand
as: the colunm name introduced on the select-list), would actually be
regarded by the SQL parser as "macro-definition". Just to place the
*string*, defined at select-list-level by "AS <name>", to wherever it's
used in the WHERE/ORDER/GROUP or HAVING clauses at earlier then
select-list-evaluation processing stage.

Actual Tom's example(1):
	SELECT 1/x AS inverse FROM data WHERE x <> 0;
extended to (2):
	SELECT 1/x AS inverse FROM data WHERE x <> 0 AND 1/x > 20;
could be written by user as (3):
	SELECT 1/x AS inverse FROM data WHERE x <> 0 AND inverse > 20;
but token/replaced to its form (2) before WHERE evaluation.

-R

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general