On Mon, Feb 27, 2006 at 18:34:16 +0300, Nikolay Samokhvalov <samokhvalov@xxxxxxxxx> wrote: > On 2/27/06, Bruno Wolff III <bruno@xxxxxxxx> wrote: > > The alternatives to distinct on are painful. They are generally both harder > > to read and run slower. > > > > 'DISTINCT ON' is evil constuction, because (w/o any 'ORDER BY') it > produses unpredictable result, as 'ORDER BY random()' does. > > When newbie types 'random()', he understands what he is doing, but > it's not the case for 'DISTINCT ON' and can lead to mistakes. The documentation for DISTINCT ON is clear about this: DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. For example, I doubt the newbie understands random() as well as DISTINCT ON on the first go around. In some uses random() is evaluated per tuple and in others it isn't. In particular it probably isn't obvious to newbies what the semantics of the following is: SELECT a FROM b WHERE a > random();