Disjunctions and sequential scans

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi there,

This is likely me not understanding something, but I have a query that
I would expect to be fast but PG insists on using a sequential scan.
I've attached a minimized test case but I'll walk through the steps as
well.

I'm running PostgreSQL 10.1 using the standard ArchLinux packages, but
I've been able to reproduce this issue on our production systems
running 9.5 as well.

I have the following 2 tables in a standard users/addresses
configuration with an extra index on addresses to make lookups on the
referring side faster:

    CREATE TABLE users (
        id integer PRIMARY KEY
    );

    CREATE TABLE addresses (
        id integer PRIMARY KEY,
        user_id integer REFERENCES users(id)
    );

    CREATE INDEX ix_addresses_user_id ON addresses (user_id);

Also, I turn off sequential scanning to force the database to consider
any other plan first:

    SET enable_seqscan TO OFF;

Then, I would expect the following query to have a query plan without
any sequential scans:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT addresses.id
    FROM addresses
    WHERE (
        addresses.id = 1 OR
        EXISTS (
            SELECT 1 FROM users
            WHERE (
                users.id = addresses.user_id AND
                users.id = 1
            )
        )
    );

    -[ RECORD 1
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Seq Scan on addresses
(cost=10000000000.00..10000018508.10 rows=1130 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
    -[ RECORD 2
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Filter: ((id = 1) OR (alternatives: SubPlan 1 or
hashed SubPlan 2))
    -[ RECORD 3
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   SubPlan 1
    -[ RECORD 4
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |     ->  Result  (cost=0.15..8.17 rows=1 width=0)
(never executed)
    -[ RECORD 5
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           One-Time Filter: (addresses.user_id = 1)
    -[ RECORD 6
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           ->  Index Only Scan using users_pkey on
users  (cost=0.15..8.17 rows=1 width=0) (never executed)
    -[ RECORD 7
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |                 Index Cond: (id = 1)
    -[ RECORD 8
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |                 Heap Fetches: 0
    -[ RECORD 9
]------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   SubPlan 2
    -[ RECORD 10
]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |     ->  Index Only Scan using users_pkey on users
users_1  (cost=0.15..8.17 rows=1 width=4) (never executed)
    -[ RECORD 11
]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           Index Cond: (id = 1)
    -[ RECORD 12
]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |           Heap Fetches: 0
    -[ RECORD 13
]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Planning time: 0.082 ms
    -[ RECORD 14
]-----------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Execution time: 0.032 ms

Given the `Seq Scan on addresses` above, the database clearly
disagrees. What am I missing here?

Strangely, breaking down the query to its components does as I expect.
This is the primary key lookup:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT addresses.id
    FROM addresses
    WHERE addresses.id = 1;

    -[ RECORD 1
]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Index Only Scan using addresses_pkey on addresses
(cost=0.15..8.17 rows=1 width=4) (actual time=0.008..0.008 rows=0
loops=1)
    -[ RECORD 2
]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Index Cond: (id = 1)
    -[ RECORD 3
]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Heap Fetches: 0
    -[ RECORD 4
]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Buffers: shared hit=1
    -[ RECORD 5
]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Planning time: 0.206 ms
    -[ RECORD 6
]-----------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Execution time: 0.031 ms

And this is the semi-join:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT addresses.id
    FROM addresses
    WHERE EXISTS (
        SELECT 1 FROM users
        WHERE (
            users.id = addresses.user_id AND
            users.id = 1
        )
    );

    -[ RECORD 1
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Nested Loop  (cost=4.40..23.19 rows=11 width=4)
(actual time=0.007..0.007 rows=0 loops=1)
    -[ RECORD 2
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   Buffers: shared hit=1
    -[ RECORD 3
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   ->  Index Only Scan using users_pkey on users
(cost=0.15..8.17 rows=1 width=4) (actual time=0.007..0.007 rows=0
loops=1)
    -[ RECORD 4
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Index Cond: (id = 1)
    -[ RECORD 5
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Heap Fetches: 0
    -[ RECORD 6
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Buffers: shared hit=1
    -[ RECORD 7
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |   ->  Bitmap Heap Scan on addresses
(cost=4.24..14.91 rows=11 width=8) (never executed)
    -[ RECORD 8
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         Recheck Cond: (user_id = 1)
    -[ RECORD 9
]---------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |         ->  Bitmap Index Scan on ix_addresses_user_id
 (cost=0.00..4.24 rows=11 width=0) (never executed)
    -[ RECORD 10
]--------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN |               Index Cond: (user_id = 1)
    -[ RECORD 11
]--------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Planning time: 0.145 ms
    -[ RECORD 12
]--------------------------------------------------------------------------------------------------------------------------
    QUERY PLAN | Execution time: 0.038 ms

I could break the `OR` into 2 separate queries with a `UNION` but that
seems like a rather strange contortion that I would expect the
database to handle for me.

Attachment: seqscan.sql
Description: application/sql


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux