In my mental model of how SQL works, the clauses of an SQL query (if present) are processed in a certain order: FROM WHERE SELECT GROUP BY HAVING ORDER BY LIMIT and each processes the output of the previous one. However, consider this: hjp=> select * from employees; ╔════════════╤═══════╤════════════╗ ║ ssn │ name │ other_data ║ ╟────────────┼───────┼────────────╢ ║ 1234010400 │ Alice │ (∅) ║ ║ 2345180976 │ Bob │ (∅) ║ ║ 2645101276 │ Carol │ (∅) ║ ║ 9843100395 │ David │ (∅) ║ ╚════════════╧═══════╧════════════╝ hjp=> select substring(ssn, 9, 2) as year, count(*) from employees group by year; ╔══════╤═══════╗ ║ year │ count ║ ╟──────┼───────╢ ║ 95 │ 1 ║ ║ 76 │ 2 ║ ║ 00 │ 1 ║ ╚══════╧═══════╝ (3 rows) In the GROUP BY clause I can use the alias year which was defined earlier in SELECT. HAVING comes after GROUP BY, so I should be able to use that there, too. Right? hjp=> select substring(ssn, 9, 2) as year, count(*) from employees group by year having year > '20'; ERROR: column "year" does not exist LINE 1: ...ear, count(*) from employees group by year having year > '20... Wrong. I have to use the whole expression again: hjp=> select substring(ssn, 9, 2) as year, count(*) from employees group by year having substring(ssn, 9, 2) > '20'; ╔══════╤═══════╗ ║ year │ count ║ ╟──────┼───────╢ ║ 95 │ 1 ║ ║ 76 │ 2 ║ ╚══════╧═══════╝ (2 rows) This seems inconsistent to me. Is there a technical or semantic reason for this or is just "because the standard says so". hp PS: Please no discussions about the appropriateness of using an SSN as an id. This is a completely made-up example. -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@xxxxxx | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment:
signature.asc
Description: PGP signature