On 10/8/17 3:02 PM, Jim Nasby wrote:
-> Index Scan using bdata_filed_departuretime on bdata_forks
       (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)
-> Index Scan using bdata_filed_departuretime on bdata_forks
       (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)
I think the reason it's discarding the preferable plan is that, with this
huge increment in the estimated cost getting added to both alternatives,
the two nestloop plans have fuzzily the same total cost, and it's picking
the one you don't want on the basis of some secondary criterion.
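One way to see both nestloop alternatives side by side is to nudge the planner away from its first choice inside a transaction and EXPLAIN again. Just a sketch; the SELECT below is only a stand-in for the real query:

    BEGIN;
    SET LOCAL enable_nestloop = off;   -- or enable_indexscan = off, depending on which path you want to suppress
    EXPLAIN (ANALYZE, BUFFERS)
        SELECT count(*) FROM bdata_forks;   -- stand-in for the original query
    ROLLBACK;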
Great question... the only thing that sticks out is the coalesce(). Let
me see if an ANALYZE with a higher stats target changes anything. FWIW,
the 9.6 database is copied from the 9.4 one once a week and then
pg_upgraded; I'm pretty sure an ANALYZE is part of that process.
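For reference, bumping the stats target on that column would look something like this (table and column names taken from the plans above; 1000 is just an example target):

    ALTER TABLE bdata_forks ALTER COLUMN filed_departuretime SET STATISTICS 1000;
    ANALYZE bdata_forks;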
It turns out that ANALYZE is the 'problem'. On the 9.4 database, pg_stats
shows that the newest date in filed_departuretime is 3/18/2017, while on
the 9.6 database pg_stats is up to date. If I change the query to use
2/9/2018 instead of 7/20/2017, I get the same results.
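I'm checking what the planner thinks the column's upper bound is by looking at the histogram from the last ANALYZE, roughly like this (same names as above):

    SELECT histogram_bounds
    FROM pg_stats
    WHERE tablename = 'bdata_forks'
      AND attname = 'filed_departuretime';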
So, the larger cost estimate is theoretically more correct. If I set
random_page_cost = 1, I end up with a good plan.
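That was just a session-level test; something along these lines (ALTER SYSTEM only if you want it to stick cluster-wide):

    SET random_page_cost = 1;                 -- session-only test
    -- or persist it for the whole cluster (9.4+):
    ALTER SYSTEM SET random_page_cost = 1;
    SELECT pg_reload_conf();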
--
Jim C. Nasby, Data Architect jim@xxxxxxxxx
512.569.9461 (cell) http://jim.nasby.net