We recently took a copy of our production data (running on 8.4.2), scrubbed many data fields, and then loaded it onto a qa server (running 8.4.8). We're seeing some odd planner performance that I think might be a bug, though I'm hoping it's just idiocy on my part. I've analyzed things and looked into pg_stats and it seems as if the relevant columns have about the same statistics. I've managed to simplify the query, but if I make it any simpler, then the two servers end up with the same good plan. The query is down to: SELECT machines.quota_purchased FROM machines WHERE NOT deleted AND machines.user_id IN ( SELECT id FROM users WHERE user_group_id IN ( select 607547 offset 0 ) OFFSET 0 ); (Those "offset 0" are in there to protect us from planner regressions we saw when moving to 8.4. When we move to 9, they can hopefully go away.) On the production server, this returns a fairly accurate plan: QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=843.59..1447.90 rows=243 width=8) (actual time=0.044..0.045 rows=1 loops=1) -> HashAggregate (cost=843.59..845.59 rows=200 width=4) (actual time=0.027..0.027 rows=1 loops=1) -> Limit (cost=0.02..823.90 rows=1575 width=4) (actual time=0.024..0.025 rows=1 loops=1) -> Nested Loop (cost=0.02..823.90 rows=1575 width=4) (actual time=0.023..0.024 rows=1 loops=1) -> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1) -> Limit (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1) -> Index Scan using users_user_groups_idx on users (cost=0.00..804.18 rows=1575 width=8) (actual time=0.017..0.018 rows=1 loops=1) Index Cond: (users.user_group_id = (607547)) -> Index Scan using machines_sid_un on machines (cost=0.00..3.00 rows=1 width=12) (actual time=0.015..0.015 rows=1 loops=1) Index Cond: (machines.user_id = users.id) Total runtime: 0.121 ms (12 rows) On the QA server, things are not so accurate. It doesn't hurt the timing of this simplified query much, but when put into the actual query, the row estimation being off by 6 orders of magnitude really throws the planning in the wrong direction. The plan on the QA server is: QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=1887.16..3671.20 rows=1192462 width=8) (actual time=0.049..0.051 rows=1 loops=1) -> HashAggregate (cost=1887.16..1889.16 rows=200 width=4) (actual time=0.032..0.033 rows=1 loops=1) -> Limit (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.029 rows=1 loops=1) -> Nested Loop (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.028 rows=1 loops=1) -> HashAggregate (cost=0.02..0.03 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1) -> Limit (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1) -> Index Scan using users_user_groups_idx on users (cost=0.00..1849.20 rows=1517 width=8) (actual time=0.015..0.016 rows=1 loops=1) Index Cond: (users.user_group_id = (607547)) -> Index Scan using machines_sid_un on machines (cost=0.00..8.90 rows=1 width=12) (actual time=0.013..0.013 rows=1 loops=1) Index Cond: (machines.user_id = users.id) Total runtime: 0.148 ms (12 rows) The problem here (I think) seems to be that the QA server believes that running a nested loop over 200 users.id values and joining that against machines.user_id will result in >1M rows. The production servers sees this more accurately as the nearly 1:1 relationship that it is. The reason I wonder if this might be a bug is because if I change the obtuse clause "WHERE user_group_id IN (select 607547 offset 0)" to simply "where user_group_id in (607547)" then the plan collapses to the same plan on both servers: explain analyze SELECT machines.quota_purchased FROM machines WHERE NOT deleted AND machines.user_id IN ( SELECT id FROM users WHERE user_group_id IN ( 607547 ) OFFSET 0 ); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=82.27..636.70 rows=62 width=8) (actual time=0.042..0.043 rows=1 loops=1) -> HashAggregate (cost=82.27..82.88 rows=61 width=4) (actual time=0.024..0.025 rows=1 loops=1) -> Limit (cost=0.00..81.51 rows=61 width=4) (actual time=0.017..0.018 rows=1 loops=1) -> Index Scan using users_user_groups_idx on users (cost=0.00..81.51 rows=61 width=4) (actual time=0.015..0.016 rows=1 loops=1) Index Cond: (user_group_id = 607547) -> Index Scan using machines_sid_un on machines (cost=0.00..9.07 rows=1 width=12) (actual time=0.013..0.013 rows=1 loops=1) Index Cond: (machines.user_id = users.id) Total runtime: 0.106 ms (8 rows) But, as I understand it, that remaining OFFSET 0 should keep the planner from caring that it's a single value or the result of a subselect. Anyway, anything I can check? The hardware is different so the configuration files are different, but they're not that different in anything other than effective_cache_size. Certainly not different in any way I can think that would affect this kind of planning mis-estimation. |