Re: query planner placement of sort/limit w.r.t. joins

David Rowley <david.rowley@xxxxxxxxxxxxxxx> · Sat, 29 Apr 2017 10:24:57 +1200

On 29 April 2017 at 07:59, Dave Vitek <dvitek@xxxxxxxxxxxxxx> wrote:
> Is what I want in the query planner's vocabulary?  It would need to exploit
> the fact that the _id columns are not nullable, and either exploit the
> uniqueness of the id columns or do an extra LIMIT step after the join.  I
> think I want it to effectively give the same result I expect (haven't
> tested) it would give for:

Unfortunately, it's not a plan that the current planner will consider.

> SELECT D.val, B.val, C.val FROM
>    (SELECT * FROM A ORDER BY A.rank LIMIT 100) AS D
>    JOIN B ON D.b_id = B.id
>    JOIN C ON B.c_id = C.id
>    LIMIT 100;
>
> Perhaps there are reasons why this optimization is not safe that I haven't
> thought about?

Yeah, I think so. What happens if an A row cannot find a match in B or
C? This version of the query will end up returning fewer rows due to
that, but the original version would consider other rows with a higher
rank.

We've danced around a bit with using foreign keys as proofs that rows
will exist for other optimisations in the past, but it's tricky ground
since foreign keys are not updated immediately, so there are windows
where they may not actually hold true to their word.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general