On Wed, 2023-06-21 at 11:31 -0700, Wells Oliver wrote: > Dead simple date scan across a big-ish table (est. 23,153,666 rows) > > explain analyze select count(*) from vw_pitches where game_date >= '2022-06-21' and game_date <= '2023-06-21'; > > The view does do some joins but those don't seem to be the issue to me. > > Planner does: > > Finalize Aggregate (cost=3596993.88..3596993.89 rows=1 width=8) (actual time=69980.491..69982.076 rows=1 loops=1) > -> Gather (cost=3596993.46..3596993.87 rows=4 width=8) (actual time=69979.137..69982.071 rows=5 loops=1) > Workers Planned: 4 > Workers Launched: 4 > -> Partial Aggregate (cost=3595993.46..3595993.47 rows=1 width=8) (actual time=69975.136..69975.137 rows=1 loops=5) > -> Nested Loop (cost=0.44..3591408.37 rows=1834034 width=0) (actual time=0.882..69875.934 rows=1458419 loops=5) > -> Parallel Seq Scan on pitches p (cost=0.00..3537431.89 rows=1834217 width=12) (actual time=0.852..68914.256 rows=1458419 loops=5) > Filter: ((game_date >= '2022-06-21'::date) AND (game_date <= '2023-06-21'::date)) > Rows Removed by Filter: 3212310 > -> Memoize (cost=0.44..0.47 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=7292095) > Cache Key: p.pitcher_identity_id > Cache Mode: logical > Hits: 1438004 Misses: 21042 Evictions: 0 Overflows: 0 Memory Usage: 2138kB > Worker 0: Hits: 1429638 Misses: 21010 Evictions: 0 Overflows: 0 Memory Usage: 2134kB > Worker 1: Hits: 1456755 Misses: 21435 Evictions: 0 Overflows: 0 Memory Usage: 2177kB > Worker 2: Hits: 1433557 Misses: 21201 Evictions: 0 Overflows: 0 Memory Usage: 2154kB > Worker 3: Hits: 1428727 Misses: 20726 Evictions: 0 Overflows: 0 Memory Usage: 2105kB > -> Index Only Scan using identity_pkey on identity idpitcher (cost=0.43..0.46 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=105414) > Index Cond: (identity_id = p.pitcher_identity_id) > Heap Fetches: 83 > Planning Time: 1.407 ms > Execution Time: 69982.927 ms > > Is there something to be done here? Kind of a frequent style of query and quite slow. First, make sure that PostgreSQL isn't picking the correct plan. Play with "enable_seqscan" and see if the query becomes slower. Then, make sure that "random_page_cost" is set correctly. For SSDs and other storage where random I/O is not more expensive, 1.1 or 1.0 are better values than the default 4. Finally, try disabling or reducing parallel query. PostgreSQL does not optimize for low resource usage, but for short execution time. Perhaps running the qurey with 5 processes actually is faster this way, but it might use way more resorces than a different execution plan. Yours, Laurenz Albe