Postgres not using GiST index in a lateral join

Igor Stassiy <istassiy@xxxxxxxxx> · Wed, 04 Mar 2015 14:36:56 +0000

Hello,
I have a query plan optimization question. It is formatted nicely on 

http://stackoverflow.com/questions/28856452/postgres-not-using-gist-index-in-lateral-join

But here is a copy for the archive:

Here is the setup:
CREATE EXTENSION postgis;
DROP TABLE IF EXISTS A;
DROP TABLE IF EXISTS B; 
CREATE TABLE A(shape Geometry, id INT); 
CREATE TABLE B(shape Geometry, id INT, kind INT); 
CREATE INDEX ON A USING GIST (shape); 
CREATE INDEX ON B USING GIST (shape);

I am running the following commands:

ANALYZE A; 
ANALYZE B;

-- for each row in A, select exactly one row in B (if there is one)
-- such that B contains geometry of A
EXPLAIN
 SELECT * FROM (SELECT A.id as aid, (SELECT B.id FROM B WHERE
 ST_Contains(B.shape, A.shape)
 AND B.kind != 1 LIMIT 1) as bid FROM A) AS TMP;

which gives me {
 "Plan": {
 "Node Type": "Seq Scan",
 "Relation Name": "A",
 "Startup Cost": 0.00,
 "Total Cost": 2606592.33,
 "Plan Rows": 549745,
 "Plan Width": 1646,
 "Plans": [
 {
 "Node Type": "Limit",
 "Parent Relationship": "SubPlan",
 "Subplan Name": "SubPlan 1",
 "Startup Cost": 0.00,
 "Total Cost": 4.68,
 "Plan Rows": 1,
 "Plan Width": 8,
 "Plans": [
 {
 "Node Type": "Seq Scan",
 "Parent Relationship": "Outer",
 "Relation Name": "B",
 "Startup Cost": 0.00,
 "Total Cost": 4.68,
 "Plan Rows": 1,
 "Plan Width": 8,
 "Filter": "((shape && A.shape) AND _st_contains(shape, A.shape))"
 }
 ]
 }
 ]
 }
 }

Note that there is a sequential scan inside the lateral join, however there is clearly an index available. However after setting

set enable_seqscan=false;

the index is being used. This actually affects runtime significantly (around 3 times faster) and seems that postgres should figure things like that automatically. {
 "Plan": {
 "Node Type": "Seq Scan",
 "Relation Name": "A",
 "Startup Cost": 10000000000.00,
 "Total Cost": 10004716493.85,
 "Plan Rows": 549450,
 "Plan Width": 1677,
 "Plans": [
 {
 "Node Type": "Limit",
 "Parent Relationship": "SubPlan",
 "Subplan Name": "SubPlan 1",
 "Startup Cost": 0.00,
 "Total Cost": 8.52,
 "Plan Rows": 1,
 "Plan Width": 8,
 "Plans": [
 {
 "Node Type": "Index Scan",
 "Parent Relationship": "Outer",
 "Scan Direction": "NoMovement",
 "Index Name": "B_shape_idx",
 "Relation Name": "B",
 "Startup Cost": 0.00,
 "Total Cost": 8.52,
 "Plan Rows": 1,
 "Plan Width": 8,
 "Index Cond": "(shape && A.shape)",
 "Filter": "_st_contains(shape, A.shape)"
 }
 ]
 }
 ]
 }
 }

Is there any way to tell postgres to use index in a less hacky way? Possibly by rewriting the query? From what I understand the use of set enable_... is not recommended in production.

When you actually run the commands above it will give

{
 "Plan": {
 "Node Type": "Seq Scan",
 "Relation Name": "a",
 "Alias": "a",
 "Startup Cost": 0.00,
 "Total Cost": 10372.75,
 "Plan Rows": 1230,
 "Plan Width": 36,
 "Plans": [
 {
 "Node Type": "Limit",
 "Parent Relationship": "SubPlan",
 "Subplan Name": "SubPlan 1",
 "Startup Cost": 0.14,
 "Total Cost": 8.41,
 "Plan Rows": 1,
 "Plan Width": 4,
 "Plans": [
 {
 "Node Type": "Index Scan",
 "Parent Relationship": "Outer",
 "Scan Direction": "NoMovement",
 "Index Name": "b_shape_idx",
 "Relation Name": "b",
 "Alias": "b",
 "Startup Cost": 0.14,
 "Total Cost": 8.41,
 "Plan Rows": 1,
 "Plan Width": 4,
 "Index Cond": "(shape && a.shape)",
 "Filter": "((kind <> 1) AND _st_contains(shape, a.shape))"
 }
 ]
 }
 ]
 }
 }

Unfortunately I cannot provide data to reproduce the query plan results.

Thanks,
Igor