I would like to stop executing the query for a row of table "a" when a single row of "b" is found. This query would not stop
processing but will filter all the rows that are found at the end of execution.
processing but will filter all the rows that are found at the end of execution.
Is there a way to express this without a subquery?
On Wed, Mar 4, 2015 at 11:53 PM Paul Ramsey <pramsey@xxxxxxxxxxxxxxxxx> wrote:
Stop writing so many subqueries, think in joins; the poor planner!
SELECT DISTINCT ON (a.id) a.id AS a_id, b.id AS b_id
FROM a
JOIN b
ON ST_Contains(b.shape, a.shape)
WHERE b.kind != 1
Also, the DISTINCT ON syntax (PgSQL custom) lets you winnow a result
set down to just one of the inputs.
P.
On Wed, Mar 4, 2015 at 6:36 AM, Igor Stassiy <istassiy@xxxxxxxxx> wrote:
> Hello,
>
> I have a query plan optimization question. It is formatted nicely on
>
> http://stackoverflow.com/questions/28856452/postgres-not-using-gist-index-in-lateral-join
>
> But here is a copy for the archive:
>
> Here is the setup:
>
> CREATE EXTENSION postgis;
> DROP TABLE IF EXISTS A;
> DROP TABLE IF EXISTS B;
> CREATE TABLE A(shape Geometry, id INT);
> CREATE TABLE B(shape Geometry, id INT, kind INT);
> CREATE INDEX ON A USING GIST (shape);
> CREATE INDEX ON B USING GIST (shape);
>
> I am running the following commands:
>
> ANALYZE A;
> ANALYZE B;
>
> -- for each row in A, select exactly one row in B (if there is one)
> -- such that B contains geometry of A
> EXPLAIN SELECT * FROM (SELECT A.id as aid, (SELECT B.id FROM B WHERE
> ST_Contains(B.shape, A.shape) AND B.kind != 1 LIMIT 1) as bid FROM A) AS
> TMP;
>
> which gives me { "Plan": { "Node Type": "Seq Scan", "Relation Name": "A",
> "Startup Cost": 0.00, "Total Cost": 2606592.33, "Plan Rows": 549745, "Plan
> Width": 1646, "Plans": [ { "Node Type": "Limit", "Parent Relationship":
> "SubPlan", "Subplan Name": "SubPlan 1", "Startup Cost": 0.00, "Total Cost":
> 4.68, "Plan Rows": 1, "Plan Width": 8, "Plans": [ { "Node Type": "Seq Scan",
> "Parent Relationship": "Outer", "Relation Name": "B", "Startup Cost": 0.00,
> "Total Cost": 4.68, "Plan Rows": 1, "Plan Width": 8, "Filter": "((shape &&
> A.shape) AND _st_contains(shape, A.shape))" } ] } ] } }
>
>
> Note that there is a sequential scan inside the lateral join, however there
> is clearly an index available. However after setting
>
> set enable_seqscan=false;
>
> the index is being used. This actually affects runtime significantly (around
> 3 times faster) and seems that postgres should figure things like that
> automatically. { "Plan": { "Node Type": "Seq Scan", "Relation Name": "A",
> "Startup Cost": 10000000000.00, "Total Cost": 10004716493.85, "Plan Rows":
> 549450, "Plan Width": 1677, "Plans": [ { "Node Type": "Limit", "Parent
> Relationship": "SubPlan", "Subplan Name": "SubPlan 1", "Startup Cost": 0.00,
> "Total Cost": 8.52, "Plan Rows": 1, "Plan Width": 8, "Plans": [ { "Node
> Type": "Index Scan", "Parent Relationship": "Outer", "Scan Direction":
> "NoMovement", "Index Name": "B_shape_idx", "Relation Name": "B", "Startup
> Cost": 0.00, "Total Cost": 8.52, "Plan Rows": 1, "Plan Width": 8, "Index
> Cond": "(shape && A.shape)", "Filter": "_st_contains(shape, A.shape)" } ] }
> ] } }
>
> Is there any way to tell postgres to use index in a less hacky way? Possibly
> by rewriting the query? From what I understand the use of set enable_... is
> not recommended in production.
>
> When you actually run the commands above it will give
>
> { "Plan": { "Node Type": "Seq Scan", "Relation Name": "a", "Alias": "a",
> "Startup Cost": 0.00, "Total Cost": 10372.75, "Plan Rows": 1230, "Plan
> Width": 36, "Plans": [ { "Node Type": "Limit", "Parent Relationship":
> "SubPlan", "Subplan Name": "SubPlan 1", "Startup Cost": 0.14, "Total Cost":
> 8.41, "Plan Rows": 1, "Plan Width": 4, "Plans": [ { "Node Type": "Index
> Scan", "Parent Relationship": "Outer", "Scan Direction": "NoMovement",
> "Index Name": "b_shape_idx", "Relation Name": "b", "Alias": "b", "Startup
> Cost": 0.14, "Total Cost": 8.41, "Plan Rows": 1, "Plan Width": 4, "Index
> Cond": "(shape && a.shape)", "Filter": "((kind <> 1) AND _st_contains(shape,
> a.shape))" } ] } ] } }
>
> Unfortunately I cannot provide data to reproduce the query plan results.
>
> Thanks,
> Igor