Hello Jeff,
Sorry for the delay, here are the EXPLAIN ANALYSE results for one single row in the inner-query:
Nested Loop Semi Join (cost=10000000993.81..10004731160.70 rows=536206 width=28) (actual time=93765.182..93765.183 rows=0 loops=1)
Output: fu.w2_page_idxs
Join Filter: (fu.w2_page_idxs && (ARRAY[fact_pages.idx]))
Rows Removed by Join Filter: 53762825
Buffers: shared hit=569194 read=2821768
I/O Timings: read=56586.955
-> Seq Scan on public.fact_users fu (cost=10000000000.00..10003925857.68 rows=53620568 width=28) (actual time=79.139..67423.779 rows=53762825 loops=1)
Output: fu.w2_page_idxs
Buffers: shared hit=567884 read=2821768
I/O Timings: read=56586.955
-> Materialize (cost=993.81..994.50 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=53762825)
Output: (ARRAY[fact_pages.idx])
Buffers: shared hit=148
-> Limit (cost=993.81..994.48 rows=1 width=32) (actual time=26.382..26.383 rows=1 loops=1)
Output: (ARRAY[fact_pages.idx])
Buffers: shared hit=148
-> Bitmap Heap Scan on public.fact_pages (cost=993.81..70645.00 rows=103556 width=32) (actual time=26.378..26.379 rows=1 loops=1)
Output: ARRAY[fact_pages.idx]
Recheck Cond: (fact_pages.attribute_idxs && '{300000160}'::integer[])
Heap Blocks: exact=1
Buffers: shared hit=148
-> Bitmap Index Scan on fact_pages_attribute_idxs_int (cost=0.00..967.92 rows=103556 width=0) (actual time=14.865..14.865 rows=101462 loops=1)
Index Cond: (fact_pages.attribute_idxs && '{300000160}'::integer[])
Buffers: shared hit=147
Query Identifier: 6779965332684941204
Planning:
Buffers: shared hit=2
Planning Time: 0.162 ms
JIT:
Functions: 10
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 1.507 ms, Inlining 9.797 ms, Optimization 54.902 ms, Emission 14.314 ms, Total 80.521 ms
Execution Time: 93766.772 ms
Query:
EXPLAIN (
ANALYZE,
VERBOSE,
COSTS,
BUFFERS,
TIMING
)
SELECT
fu.w2_page_idxs
FROM
fact_users
AS fu
WHERE
EXISTS (
SELECT
FROM
(
SELECT
ARRAY[idx] AS page_idx
FROM
fact_pages
WHERE
attribute_idxs && ARRAY[300000160]
FETCH FIRST 1 ROWS ONLY
)
AS fp
WHERE
fu.w2_page_idxs && fp.page_idx
)
;
Without any surprises, the planner is using a sequential scan on the "fact_users" table which is very large instead of using the GIN index set on the "w2_page_idxs" column.
Link to the query plan visualiser: https://explain.dalibo.com/plan/1vC
Thank you very much in advance,
Mickael
On Wed, Apr 27, 2022 at 4:54 PM Mickael van der Beek <mickael.van.der.beek@xxxxxxxxx> wrote:
Hello Jeff,I have waited a few hours without the query ever finishing which is the reason I said "never finishes".Especially because the INNER JOIN version finishes within a few minutes while being combinatorial and less efficient.The query probably only does sequential scans.You will find the query plan using EXPLAIN here:- Visual query plan: https://explain.dalibo.com/plan#plan- Raw query plan: https://explain.dalibo.com/plan#rawThanks for your help,MickaelOn Wed, Apr 27, 2022 at 4:28 PM Jeff Janes <jeff.janes@xxxxxxxxx> wrote:On Wed, Apr 27, 2022 at 8:19 AM Mickael van der Beek <mickael.van.der.beek@xxxxxxxxx> wrote:The last query does not finish after waiting for more than 15 minutes.
(The temporary view creation is very fast and required due to the same query in a CTE greatly reducing performance (by more than 5 min.) due to the optimisation barrier I'm guessing.)How much over 15 minutes? 20 minutes doesn't seem that long to wait to get a likely definitive answer. But at the least show us the EXPLAIN without ANALYZE of it, that should take no milliseconds.And what does it mean for something to take 5 minutes longer than "never finishes"?(Also, putting every or every other token on a separate line does not make it easier to read)Cheer,Jeff--