> owner@xxxxxxxxxxxxxx] On Behalf Of Thomas Munro > So what incremental improvements could we steal from "properly relational" > query languages? Here is my list of deficiencies in some or all dialects of SQL. -Relation and tuple as data types -Globally exclude NULLs and duplicate columns -Relation with no columns, null key -Natural antijoin -Tuple join operations -Tuple tests for equality, superset, subset -Tuple test for set membership of relation -Relation tests for equality, superset, subset -Relation set operations include symmetric difference -User-defined functions of arbitrary complexity -User-defined aggregation functions of arbitrary complexity -User-defined ordered queries of arbitrary complexity -Iterative/recursive queries (when) > Here's one I've thought about, trivial as it may be. I noticed that Tutorial > D (and apparently Andl too) includes a [NOT] MATCHING operator (alternatively > spelled SEMIJOIN and SEMIMINUS) corresponding to the ⋉ (semi-join) and ▷ > (anti-join) operators from relational algebra. In SQL you write [NOT] EXISTS > or [NOT] IN in the WHERE clause, rather than something explicit in a <joined > table> clause, though experienced users often talk explicitly about semi- and > anti-joins, both because of the theory and because the terms show up in query > plans. Yes, there are two joins (join and antijoin). Semijoin is just one of a number of projections following a join, but antijoin is a quite different algorithm. Antijoin is quite hard to write in SQL in such a way that the query planner will do the right thing. There is a lot of variation between dialects. > A recent blog post[1] argues that SQL should have a SQL92-style JOIN syntax > for this and points at a couple of products that have added one[2][3]. I > guess it might be hard to convince the Postgres community to add support for > a non-standard syntax that doesn't give you anything you can't already do, > but as an idea I find it interesting and it seems to be in the spirit of the > part of the Third Manifesto that says: "support[ing] the usual operators of > the relational algebra [..]. All such operators shall be expressible without > excessive circumlocution." The purpose is simply that explicit syntax allows for explicit query optimisation. > > For example, say we want all students who have one or more exam today: > > SELECT s.student_id, s.name > FROM student s > WHERE EXISTS (SELECT 1 > FROM exam e > WHERE e.student_id = s.student_id > AND e.exam_date = CURRENT_DATE) > > I don't know Tutorial D, but I think it might express that with something > like: > > ( student MATCHING exam > WHERE exam_date = CURRENT_DATE ) > { student_id, name } > > With 10 minutes of prototype hacking I convinced Postgres to accept SEMI and > ANTI like this: > > SELECT s.student_id, s.name > FROM student s SEMI JOIN exam e USING (student_id) > WHERE e.exam_date = CURRENT_DATE > > I guess a real version should accept (or require?) LEFT or RIGHT before > SEMIANTI. Andl has semijoin, rsemijoin, ajoin and rajoin. They are trivial to add once you have the basic algorithm, but Tutorial D does not. When using this hypothetical syntax I think you should be allowed > to refer to e.exam_date in the WHERE clause but not in the SELECT list > (Impala apparently does allow you to see data from exam, and returns values > from an arbitrary matching row, but that doesn't seem right to me). But the > EXISTS syntax is correspondingly strange in that it requires you to provide a > SELECT list which is entirely discarded, so people often write "*" or "1" > (erm, OK, I guess you can use an empty select list in recent Postgres). SQL has an implicit ordering of query evaluation -- you will often need to write a nested subquery or correlated query for what should be very straightforward situations. That's another thing that's easy to fix, if allowed. Regards David M Bennett FACS Andl - A New Database Language - andl.org -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general