Is there some way to tell the planner that, unless it's guaranteed by a constraint or something similar, it shouldn't estimate that a filter/anti-join produces 1 row (e.g. assume a minimum of 2 rows unless 1 is guaranteed), or otherwise make it more conservative about worst-case possibilities? I feel like this would cover something like a third of the more problematic planner performance issues I run into: the kind where a query suddenly runs 60,000 times slower than it did previously. I can live with some queries being slightly slower if I can avoid the case where they all of a sudden never complete.
My current motivating example is this (abridged) query plan:
PostgreSQL 11.7 on Ubuntu Linux
-> **Nested Loop Left Join**  (cost=3484616.45..5873755.65 rows=1 width=295)
     Join Filter: (hsc.c_field = c.id)
     ...
     ->  *Nested Loop Left Join  (cost=1072849.19..3286800.99 rows=1 width=190)*
           ->  Hash Anti Join  (cost=1072848.62..3286798.53 ***rows=1***[actually 65k] width=189)
                 Hash Cond: (c.id = trc.field)
                 ->  Seq Scan on c  (cost=0.00..1096064.73 rows=14328573 width=189)
                 ->  Hash  (cost=830118.31..830118.31 rows=14794985 width=4)
                       ->  Seq Scan on trc  (cost=0.00..830118.31 rows=14794985 width=4)
           ->  Index Scan using con_pkey on con  (cost=0.56..2.46 rows=1 width=9)
                 Index Cond: (c.con_field = id)
     ...
     ->  Unique  (cost=2411766.83..2479065.82 rows=4794957 width=29)
           ->  Sort  (cost=2411766.83..2445416.33 rows=13459797 width=29)
                 Sort Key: hsc.c_field, xxx
                 ->  Hash Join  (cost=11143.57..599455.83 rows=13459797 width=29)
                       ...
*** is where the planner is off in its row estimation (a stripped-down check of just that anti-join is sketched after these notes)
c.id is unique for that table, statistics set to 10k and freshly analyzed
trc.field is unique for that table, statistics set to 10k and freshly analyzed
row estimates for those tables are pretty close to correct (within a couple of %)
there is no foreign key constraint between those two tables
c.id and trc.field are both integers with pretty similar distributions over 1...22 million
** is where it picks a disastrous join plan based on that mistaken row estimate
this has to be a close call with a merge join, as the other side is already sorted
* this join is ok; even if it isn't the fastest join here with the correct row count, given the index it's not much worse
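For reference, the statistics were raised the standard way, and a stripped-down version of just that anti-join (a sketch only, since the full query is abridged above) should reproduce the rows=1 estimate in isolation:

  -- statistics target 10000 plus a fresh ANALYZE, per the notes above
  ALTER TABLE c   ALTER COLUMN id    SET STATISTICS 10000;
  ALTER TABLE trc ALTER COLUMN field SET STATISTICS 10000;
  ANALYZE c;
  ANALYZE trc;

  -- isolated anti-join matching the Hash Cond above (query shape is a guess),
  -- to compare the planner's estimate against the ~65k actual rows
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT count(*)
  FROM c
  WHERE NOT EXISTS (SELECT 1 FROM trc WHERE trc.field = c.id);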
I can work around this by breaking up the query (e.g. creating a temporary table of the selected ids, analyzing it, then using it in the rest of the query) or by temporarily disabling nested loop joins (which makes other parts of the query slower, but not dramatically so); both are sketched below. But is there some other reasonable proactive way to avoid it? It was running fine for a year before blowing up (the trigger, I suspect, is the trc table growing larger than the c table; originally it was smaller), and I hit similarish kinds of issues every so often.
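Roughly, those two workarounds look like this (tmp_ids is just an illustrative name, and the temp-table query shape is a guess since the real query is abridged):

  -- workaround 1: materialize the anti-join result so the rest of the
  -- query gets planned against real row counts
  CREATE TEMPORARY TABLE tmp_ids AS
  SELECT c.id
  FROM c
  WHERE NOT EXISTS (SELECT 1 FROM trc WHERE trc.field = c.id);
  ANALYZE tmp_ids;
  -- ... then join tmp_ids into the remainder of the original query

  -- workaround 2: disable nested loops for just this query
  BEGIN;
  SET LOCAL enable_nestloop = off;
  -- ... run the original query here ...
  COMMIT;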
Tim