Sorry for the delay; back on this, and thanks for the response.

On Mar 11, 2014, at 6:23 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

> "Murphy, Kevin" <MURPHYKE@xxxxxxxxxxxxxx> writes:
>> Synopsis: 8-table join with one "WHERE foo IN (...)" condition; works OK with fewer
>> than 5 items in the IN list, but at N=5, the planner starts using a compound index
>> for the first time that completely kills performance (5-6 minutes versus 0-12 seconds).
>> […]
>
> FWIW, I think the right question here is not "why is the slow query
> slow?", but "why is the fast query fast?".
> […]
> It looks like the trouble spot is this intermediate nested loop:
>
>>   -> Nested Loop  (cost=4.32..283545.98 rows=80929 width=12) (actual time=163.609..571.237 rows=102 loops=1)
>>        Buffers: shared hit=419 read=63
>>        -> Nested Loop  (cost=4.32..3426.09 rows=471 width=4) (actual time=93.595..112.404 rows=85 loops=1)
>>           ...
>>        -> Index Scan using sample_result_variant_id on sample_result  (cost=0.00..593.01 rows=172 width=8) (actual time=5.147..5.397 rows=1 loops=85)
>>              Index Cond: (variant_id = variant_effect.variant_id)
>>              Buffers: shared hit=400 read=42
>
> which is creating the bulk of the estimated cost for the whole plan,
> but execution is actually pretty cheap.  There seem to be two components
> to the misestimation: one is that the sub-nested loop is producing about a
> fifth as many rows as expected,

This may be because 3 out of the 4 user-supplied gene symbols were not present in the gene table at all. Restricting the query to valid gene symbols beforehand is probably a good idea (a rough sketch of what I mean is at the end of this message).

> and the other is that the probes into
> sample_result are producing (on average) 1 row, not the 172 rows the
> planner expects.  If you could get the latter estimate to be even within
> one order of magnitude of reality, the planner would certainly see this
> plan as way cheaper than the other.

I'm not sure how to improve that estimate. This data doesn't change very often (it hasn't in many weeks). There are 6M variants, but only 7.5% of them map to the sample_result table; presumably the planner knows this because of the n_distinct value on sample_result.variant_id? Each variant maps to zero or more sample_result records, but often very few, and never more than the number of samples (currently 1129).

> So I'm wondering if the stats on sample_result and variant_effect are up
> to date.  If they are, you might try increasing the stats targets for the
> variant_id columns.

The stats were up to date and were at 5K globally. I tried increasing the stats targets on the foreign key columns involved to 10K (and analyzing), but the same costs and plan are in play. I know the stats are current now because I dumped and restored onto new hardware and ran a vacuum analyze. As I mentioned earlier, some of the vanilla n_distinct values were way off for the (790M-row) sample_result table, so I have taken to coercing n_distinct using negative multipliers (commands sketched at the end of this message).
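
For concreteness, the kind of statements I mean are sketched below. The columns are the variant_id join keys from the plan above, and the 10000 target is the one I tried; the n_distinct setting is only an illustration worked from the counts I gave (roughly 6M * 7.5% ≈ 450K distinct variant_ids over ~790M rows), not necessarily the exact value I set:

    -- Raise the per-column statistics target for the join keys (10000 is the maximum):
    ALTER TABLE sample_result  ALTER COLUMN variant_id SET STATISTICS 10000;
    ALTER TABLE variant_effect ALTER COLUMN variant_id SET STATISTICS 10000;

    -- Coerce n_distinct with a negative multiplier: a negative value is read as a
    -- fraction of the table's row count, so it scales as the table grows.
    -- Illustrative only: ~450K distinct variant_ids / ~790M rows ≈ 0.00057.
    ALTER TABLE sample_result ALTER COLUMN variant_id SET (n_distinct = -0.00057);

    -- Re-analyze so the new targets and the n_distinct override take effect:
    ANALYZE sample_result;
    ANALYZE variant_effect;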
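
And for the gene filtering, something along these lines is what I had in mind, assuming the gene table's symbol column is named "symbol" (the symbols shown are made-up placeholders); the surviving rows would then feed the IN list of the big query:

    -- Keep only the user-supplied symbols that actually exist in the gene table:
    SELECT g.symbol
    FROM gene AS g
    WHERE g.symbol IN ('SYMBOL1', 'SYMBOL2', 'SYMBOL3', 'SYMBOL4');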
Regards,
Kevin