On Mon, Aug 23, 2021 at 08:53:15PM -0400, Matt Dupree wrote:
> Is it possible that the row estimate is off because of a column other than
> time?

I would test this by writing the simplest query that reproduces the
mis-estimate.

> I looked at the # of events in that time period and 1.8 million is
> actually a good estimate. What about the
> ((strpos(other_events_1004175222.hierarchy, '#close_onborading;'::text) <>
> 0) condition in the filter? It makes sense that Postgres wouldn't have a
> way to estimate how selective this condition is.

The issue I see is here.  I don't know where else I'd start but to
understand this.

| Index Scan using other_events_1004175222_pim_evdef_67951aef14bc_idx on public.other_events_1004175222 (cost=0.28..1,648,877.92 ROWS=1,858,891 width=32) (actual time=1.008..15.245 ROWS=23 loops=1)
|   Output: other_events_1004175222.user_id, other_events_1004175222."time", other_events_1004175222.session_id
|   Index Cond: ((other_events_1004175222."time" >= '1624777200000'::bigint) AND (other_events_1004175222."time" <= '1627369200000'::bigint))
|   Buffers: shared read=25

This node has no "Filter" condition; it's a scan node with a bad
over-estimate.  Note that this is due to the table's column stats, not any
index's stats, so every plan is affected, even though some happen to work
well.  The consequences of over-estimates are not as terrible as for
under-estimates, but it's bad to start with inputs that are off by 10^5.

-- 
Justin
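
A minimal sketch of such a test, assuming the table and column names from the
plan above (the statistics target of 1000 is only an illustrative value, not a
recommendation):

    -- Reproduce the mis-estimate in isolation: only the "time" range condition.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*)
    FROM other_events_1004175222
    WHERE "time" >= 1624777200000 AND "time" <= 1627369200000;

    -- Inspect what the planner currently knows about the column.
    SELECT null_frac, n_distinct, histogram_bounds
    FROM pg_stats
    WHERE tablename = 'other_events_1004175222' AND attname = 'time';

    -- Raise the column's statistics target (default is 100; 1000 is an
    -- illustrative value) and refresh the table's statistics.
    ALTER TABLE other_events_1004175222 ALTER COLUMN "time" SET STATISTICS 1000;
    ANALYZE other_events_1004175222;

If the stripped-down EXPLAIN still shows rows=1,858,891 estimated against a
couple dozen actual rows, that confirms the column stats on "time" are the
input that's off, independent of the strpos() filter.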