Hi,
We recently upgraded from pg 9.0.5 to 9.3.2 and we are observing much higher load on our hot standbys (we have 3). As you can see from the query plans below, we have some queries that are running 4-5 times slower now, many due to what looks like a bad plan in 9.3. Are there any known issues with query plan regressions in 9.3? Any ideas about how I can get back the old planning behavior with 9.3.2?
Thanks in advance for any help!
On Production System ---------------------- Postgres 9.3.2 Intel(R) Xeon(R) CPU E5649 (2.53 Ghz 6-core) 12 GB RAM Intel 710 SSD
--------------------- explain analyze select distinct on (t1.id) t1.id, t1.hostname as name, t1.active, t1.domain_id, t1.base, t1.port, t1.inter_domain_flag from location t1, host t2, container t3, resource_location t4 where t2.id = 34725278 and t3.id = t2.container_id and t4.location_id = t1.id and t4.parent_id in (select * from parentContainers(t3.id)) and t1.license is not null and (t1.license_end_date is null or t1.license_end_date >= current_date) and t1.active <> 0 and t3.active <> 0 and t4.active <> 0 and t1.domain_id = t2.domain_id and t2.domain_id = t3.domain_id and t3.domain_id = t4.domain_id and (0 = 0 or t1.active <> 0); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Unique (cost=313.44..313.45 rows=1 width=35) (actual time=989.836..989.837 rows=1 loops=1) -> Sort (cost=313.44..313.44 rows=1 width=35) (actual time=989.836..989.837 rows=1 loops=1) Sort Key: t1.id Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=1.27..313.43 rows=1 width=35) (actual time=922.484..989.791 rows=1 loops=1) Join Filter: (SubPlan 1) Rows Removed by Join Filter: 742 -> Nested Loop (cost=0.99..33.80 rows=1 width=53) (actual time=0.174..5.168 rows=934 loops=1) Join Filter: (t2.domain_id = t1.domain_id) -> Nested Loop (cost=0.71..11.23 rows=1 width=18) (actual time=0.101..0.103 rows=1 loops=1) -> Index Scan using host_pkey on host t2 (cost=0.29..5.29 rows=1 width=12) (actual time=0.041..0.042 rows=1 loops=1) Index Cond: (id = 34725278::numeric) -> Index Scan using container_pkey on container t3 (cost=0.42..5.43 rows=1 width=12) (actual time=0.057..0.058 rows=1 loops=1) Index Cond: (id = t2.container_id) Filter: ((active <> 0::numeric) AND (t2.domain_id = domain_id)) -> Index Scan using idx_location_domain_id on location t1 (cost=0.28..18.55 rows=8 width=35) (actual time=0.065..3.768 rows=934 loops=1) Index Cond: (domain_id = t3.domain_id) Filter: ((license IS NOT NULL) AND (active <> 0::numeric) AND ((license_end_date IS NULL) OR (license_end_date >= ('now'::cstring)::date))) Rows Removed by Filter: 297 -> Index Scan using idx_resource_location_domain_id on resource_location t4 (cost=0.28..27.63 rows=1 width=21) (actual time=0.532..0.849 rows=1 loops=934) Index Cond: (domain_id = t1.domain_id) Filter: ((active <> 0::numeric) AND (t1.id = location_id)) Rows Removed by Filter: 1003 SubPlan 1 -> Function Scan on parentcontainers (cost=0.25..500.25 rows=1000 width=32) (actual time=0.253..0.253 rows=2 loops=743) Total runtime: 990.045 ms (26 rows) On test box: ---------------------- Postgres 9.0.2 Intel(R) Xeon(R) CPU E5345 (2.33 Ghz 4-core) 8 GB RAM 6 x SAS 10K RAID 10
---------------------- explain analyze select distinct on (t1.id) t1.id, t1.hostname as name, t1.active, t1.domain_id, t1.base, t1.port, t1.inter_domain_flag from location t1, host t2, container t3, resource_location t4 where t2.id = 34725278 and t3.id = t2.container_id and t4.location_id = t1.id and t4.parent_id in (select * from parentContainers(t3.id)) and t1.license is not null and (t1.license_end_date is null or t1.license_end_date >= current_date) and t1.active <> 0 and t3.active <> 0 and t4.active <> 0 and t1.domain_id = t2.domain_id and t2.domain_id = t3.domain_id and t3.domain_id = t4.domain_id and (0 = 0 or t1.active <> 0); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Unique (cost=389.96..389.97 rows=1 width=1192) (actual time=217.479..217.480 rows=1 loops=1) -> Sort (cost=389.96..389.97 rows=1 width=1192) (actual time=217.477..217.477 rows=1 loops=1) Sort Key: t1.id Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=9.28..389.95 rows=1 width=1192) (actual time=103.359..217.437 rows=1 loops=1) Join Filter: ((t1.domain_id = t3.domain_id) AND (SubPlan 1)) -> Nested Loop (cost=9.28..130.66 rows=1 width=1320) (actual time=18.494..29.577 rows=744 loops=1) Join Filter: (t2.domain_id = t1.domain_id) -> Nested Loop (cost=9.28..49.44 rows=12 width=160) (actual time=18.434..21.279 rows=1000 loops=1) -> Index Scan using host_pkey on host t2 (cost=0.00..7.26 rows=1 width=64) (actual time=0.054..0.055 rows=1 loops=1) Index Cond: (id = 34725278::numeric) -> Bitmap Heap Scan on resource_location t4 (cost=9.28..36.15 rows=12 width=96) (actual time=18.370..20.638 rows=1000 loops=1) Recheck Cond: (t4.domain_id = t2.domain_id) Filter: (t4.active <> 0::numeric) -> Bitmap Index Scan on idx_resource_location_domain_id (cost=0.00..9.28 rows=12 width=0) (actual time=10.377..10.377 rows=1004 loops=1) Index Cond: (t4.domain_id = t2.domain_id) -> Index Scan using location_pkey on location t1 (cost=0.00..6.26 rows=1 width=1192) (actual time=0.006..0.007 rows=1 loops=1000) Index Cond: (t1.id = t4.location_id) Filter: ((t1.license IS NOT NULL) AND (t1.active <> 0::numeric) AND ((t1.license_end_date IS NULL) OR (t1.license_end_date >= ('now'::text)::date))) -> Index Scan using container_pkey on container t3 (cost=0.00..7.29 rows=1 width=64) (actual time=0.005..0.006 rows=1 loops=744) Index Cond: (t3.id = t2.container_id) Filter: (t3.active <> 0::numeric) SubPlan 1 -> Function Scan on parentcontainers (cost=0.25..500.25 rows=1000 width=32) (actual time=0.243..0.243 rows=2 loops=744) Total runtime: 217.735 ms