> t > '2020-01-01'::timestamp and t < '2021-01-01'::timestamp
>Not at all important, but it seems odd to be exclusive of the start and end both. I would >consider including the start with >=
>Michael Lewis | Database Engineer
>Entrata
Michael, funny I was thinking that myself minutes after posting. Perhaps it is that tiny gap that makes a difference; however changing it to t >= '2020....etc' and perfectly matching the partition range, did not change anything of significance in the explain or runtime. :-|
On that other topic, n_distinct, it is for the moment indeed hardcoded to -0,1. I have tried to reset n_distinct, and run analyze with default_target_statistics = 2000; no dice!
However, the cars_ref in question, is present in the most_common_vals of pg_stats, and according to that frequency array, that value occurs with a frequency of 1,7%. That seems correct.
select count(*)
from bigtablet
where cars_ref = 1769854207
and t >= '2020-01-01'::timestamp and t < '2021-01-01'::timestamp;
--> 2 817 169
I can add that car_ref in general is quite skewed in its distribution, but I don't think that is the issue here.
I think the key hint is that when targeting the partition child table directly, the plan changes. See below for "proof"
explain (analyze,buffers)
select *
from bigtable
where car_ref = 1769854207 and t >= '2020-01-01'::timestamp and t < '2021-01-01'::timestamp
limit 1
Limit (cost=24961.76..24962.67 rows=1 width=636) (actual time=1456.315..1456.316 rows=1 loops=1)
Buffers: shared hit=2377
-> Bitmap Heap Scan on bigtable_y2020 bigtable (cost=24961.76..2640351.94 rows=2874279 width=636) (actual time=1456.313..1456.314 rows=1 loops=1)
Recheck Cond: (car_ref = 1769854207)
Filter: ((t >= '2020-01-01 00:00:00'::timestamp without time zone) AND (t < '2021-01-01 00:00:00'::timestamp without time zone))
Heap Blocks: exact=1
Buffers: shared hit=2377
-> Bitmap Index Scan on bigtable_2020_ref_index (cost=0.00..24243.19 rows=2874336 width=0) (actual time=721.428..721.428 rows=2817169 loops=1)
Index Cond: (car_ref = 1769854207)
Buffers: shared hit=2376
Planning Time: 0.321 ms
Execution Time: 1480.087 ms
explain (analyze,buffers)
select *
from bigtable_y2020 tt
where car_ref = 1769854207 and t >= '2020-01-01'::timestamp and t < '2021-01-01'::timestamp
limit 1
Limit (cost=0.57..1.60 rows=1 width=636) (actual time=0.037..0.038 rows=1 loops=1)
Buffers: shared hit=5
-> Index Scan using bigtable_2020_ref_index on bigtable_y2020 tt (cost=0.57..2967225.58 rows=2874279 width=636) (actual time=0.036..0.036 rows=1 loops=1)
Index Cond: (car_ref = 1769854207)
Filter: ((t >= '2020-01-01 00:00:00'::timestamp without time zone) AND (t < '2021-01-01 00:00:00'::timestamp without time zone))
Buffers: shared hit=5
Planning Time: 0.349 ms
Execution Time: 0.106 ms
best regards
K