Search Postgresql Archives

Poor performance on 9.4.4 when matching multiple columns from left side of LATERAL join

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We have a table, call it "multi_id", that contains columns with IDs of various kinds of objects in my system, and another table that's a generic owner/key/value store for object attributes (think configuration settings, and I'll refer to this table as "settings"). To wit:

---------------------------------------------
CREATE TABLE multi_id (
  id1 INTEGER PRIMARY KEY,
  id2 INTEGER,
  id3 INTEGER
);
CREATE TABLE settings (
  owner_id INTEGER,
  setting_id INTEGER,
  setting_value TEXT,
  PRIMARY KEY (owner_id, setting_id)
);
CREATE UNIQUE INDEX multi_id_idx_id1 ON multi_id (id1, id2);
CREATE UNIQUE INDEX multi_id_idx_id2 ON multi_id (id2, id1);
CREATE INDEX settings_idx_setting_id ON settings (setting_id, setting_value);
---------------------------------------------

We want to find all the rows from multi_id where any of the IDs (including its primary key) have a certain setting with a certain value.

LATERAL seemed like the tool for the job, so we tried the following:

---------------------------------------------
SELECT mid.id1
FROM multi_id AS mid,
LATERAL (
    SELECT 1
    FROM settings
    WHERE setting_id = 1
    AND setting_value = 'common_1'
    AND owner_id IN (mid.id1, mid.id2, mid.id3)
) AS setting_matcher;
---------------------------------------------

When we're searching for a common value, this query takes a LONG time. It turns out the culprit is the IN clause in the subquery. If I change "owner_id IN (mid.id1, mid.id2, mid.id3)" to "owner_id = mid.id1", the query executes in about 1/900 the time. It remains that fast if I change mid.id1 to mid.id2 or mid.id3, meaning if I do a UNION of those three queries to get the same result set as the query above, the whole thing is roughly 300x faster.

Execution plan for the IN version followed by the = version (for just one of the IDs):

---------------------------------------------
Nested Loop (cost=5.39..8107.18 rows=285 width=4) (actual time=1.230..6456.567 rows=4499 loops=1) Join Filter: (settings.owner_id = ANY (ARRAY[mid.id1, mid.id2, mid.id3]))
   Rows Removed by Join Filter: 22495501
-> Seq Scan on multi_id mid (cost=0.00..78.00 rows=5000 width=12) (actual time=0.010..1.385 rows=5000 loops=1) -> Materialize (cost=5.39..310.66 rows=95 width=4) (actual time=0.000..0.263 rows=4500 loops=5000) -> Bitmap Heap Scan on settings (cost=5.39..310.19 rows=95 width=4) (actual time=1.207..3.210 rows=4500 loops=1) Recheck Cond: ((setting_id = 1) AND (setting_value = 'common_1'::text))
               Heap Blocks: exact=1405
-> Bitmap Index Scan on settings_idx_setting_id (cost=0.00..5.37 rows=95 width=0) (actual time=0.930..0.930 rows=4500 loops=1) Index Cond: ((setting_id = 1) AND (setting_value = 'common_1'::text))
 Planning time: 0.178 ms
 Execution time: 6456.897 ms


Hash Join (cost=145.98..472.93 rows=103 width=4) (actual time=2.677..6.890 rows=4500 loops=1)
   Hash Cond: (settings.owner_id = mid.id1)
-> Bitmap Heap Scan on settings (cost=5.48..330.50 rows=103 width=4) (actual time=1.194..3.477 rows=4500 loops=1) Recheck Cond: ((setting_id = 1) AND (setting_value = 'common_1'::text))
         Heap Blocks: exact=1405
-> Bitmap Index Scan on settings_idx_setting_id (cost=0.00..5.45 rows=103 width=0) (actual time=0.854..0.854 rows=4500 loops=1) Index Cond: ((setting_id = 1) AND (setting_value = 'common_1'::text)) -> Hash (cost=78.00..78.00 rows=5000 width=4) (actual time=1.463..1.463 rows=5000 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 176kB
-> Seq Scan on multi_id mid (cost=0.00..78.00 rows=5000 width=4) (actual time=0.007..0.717 rows=5000 loops=1)
 Planning time: 0.311 ms
 Execution time: 7.166 ms
---------------------------------------------

What am I doing wrong in the IN version of the query, if anything?

I wrote a script to populate a test database with a simplified version of our real data model and demonstrate the behavior I'm seeing: https://gist.github.com/sgrimm-sg/2722068ef844d3e02129

Thanks!



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux