2016-03-16 21:23 GMT+01:00 Doiron, Daniel <DoironD@xxxxxxxxxxxx>:
I have the following queries:
EXPLAIN (ANALYZE, VERBOSE, COSTS, BUFFERS, TIMING)select […]from f_calc_service a11,d_patient_type a12where a11.d_patient_pop_id in (336)and a11.d_patient_type_id = a12.idand a12.short_name = 'I'group by a11.d_rate_schedule_id,a11.d_payer_id,a11.d_patient_pop_id,a11.d_patient_type_id;
And
EXPLAIN (ANALYZE, VERBOSE, COSTS, BUFFERS, TIMING)select […]from f_calc_service a11,d_patient_type a12where a11.d_patient_pop_id in (336)and a11.d_patient_type_id = a12.idand a12.short_name = 'O'group by a11.d_rate_schedule_id,a11.d_payer_id,a11.d_patient_pop_id,a11.d_patient_type_id;
Making this one change from short_name = ‘I’ to short_name = ‘O’ changes the query execution from 200k ms to 280ms. The first one chooses a Nested Loop, the second chooses a hash join. How do I get them both to choose the same? There are no values for d_patient_pop_id in (336) and short_name = ‘I’.
we don't see plans, so it is blind shot,
Probably the estimation for 'I' value is pretty underestimated - so planner choose nested loop. The reasons can be different - possible correlation inside data for example.
You can try:
0) ensure so your statistic are current - run statement ANALYZE
a) increase statistic by statement ALTER TABLE xx ALTER COLUMN yyy SET STATISTICS some number
b) penalize nested loop - statement SET enable_nestloop TO off;
Regards
Pavel
Thanks!
Dan