I've been working on trying to normalize a table that's got a bunch of text fields. Normalizing the first 4 has been a non-issue. But when I try and normalize 2 additional fields a bunch of query plans go belly-up. I've seen this on our old 8.4 databases as well as 9.1. I haven't been able to test on anything newer yet. A representative query is below. Note that this is a stripped down version of something larger, so it seems a bit silly the way it's written, but I've got others that are exhibiting the same bad behavior: EXPLAIN ANALYZE select * from ( select distinct e.login AS Login, d.date1, d.date2, (SELECT COUNT(DISTINCT n.customer_id) FROM notes n inner join loans l on n.customer_id = l.customer_id inner join loan_tasks_committed as ltc on l.id = ltc.loan_id AND n.note_time BETWEEN date1 AND date2 + interval '1 day' AND n.activity_cd in ('help_draw') AND ltc.created_on between n.note_time - interval '10 minutes' and n.note_time + interval '1 day' AND getcust(ltc.created_by)=e.login where n.employee_id = e.id ) AS "Draw" FROM (select current_date-1 as date1, current_date-1 as date2) as d, employees e inner join employees_roles e_r on e.id=e_r.employee_id WHERE e_r.role_id IN (3,23) --Inbound Customer Support Operator or Customer Service Issue Rep GROUP BY Login, e.id, date1, date2 ORDER BY Login) a WHERE "Draw" > 0 ; loan_tasks_committed is the table that's been normalized. For testing I'm swapping between tables with two normalization views that boil down to: _by: SELECT lt_by.id, lt_by.lock_version, c.id AS created_by_id, c.by AS created_by, u.id AS updated_by_id, u.by AS updated_by, lt_by.created_on, lt_by.updated_on, lt_by.loan_id, lt_by.entered_on, lt_by.acct_date, lt_by.task_amount, lt_by.loan_task_cd, lt_by.parent_id, lt_by.type_cd, e.id AS entered_by_id, e.by AS entered_by, cl.id AS collected_by_id, cl.by AS collected_by, lt_by.currency_cd, lt_by.committed, lt_by.loan_task_code_id FROM lt_by LEFT JOIN by_text c ON lt_by.created_by_id = c.id LEFT JOIN by_text u ON lt_by.updated_by_id = u.id LEFT JOIN by_text e ON lt_by.entered_by_id = e.id LEFT JOIN by_text cl ON lt_by.collected_by_id = cl.id; _cd: SELECT lt_cd.id, lt_cd.lock_version, c.id AS created_by_id, c.by AS created_by, u.id AS updated_by_id, u.by AS updated_by, lt_cd.created_on, lt_cd.updated_on, lt_cd.loan_id, lt_cd.entered_on, lt_cd.acct_date, lt_cd.task_amount, ltc.id AS loan_task_code_id, ltc.loan_task_code, ltc.loan_task_code AS loan_task_cd, lt_cd.parent_id, tc.id AS loan_task_type_id, tc.loan_task_type, tc.loan_task_type AS type_cd, e.id AS entered_by_id, e.by AS entered_by, cl.id AS collected_by_id, cl.by AS collected_by, lt_cd.currency_cd, lt_cd.committed FROM lt_cd LEFT JOIN by_text c ON lt_cd.created_by_id = c.id LEFT JOIN by_text u ON lt_cd.updated_by_id = u.id LEFT JOIN by_text e ON lt_cd.entered_by_id = e.id LEFT JOIN by_text cl ON lt_cd.collected_by_id = cl.id LEFT JOIN lt_code ltc ON lt_cd.loan_task_code_id = ltc.id LEFT JOIN lt_type tc ON lt_cd.loan_task_type_id = tc.id; As you can see, they're identical except for normalizing 2 additional fields. The additional normalization results in a moderate amount of heap space savings: SELECT table_name, rows, heap_size, index_size, toast_size, total_size FROM tools.space WHERE table_schema='jnasby'; table_name | rows | heap_size | index_size | toast_size | total_size ------------------+-------------+------------+------------+------------+------------ by_text | 721492 | 41 MB | 49 MB | 40 kB | 90 MB lt_code | 42 | 8192 bytes | 32 kB | 32 kB | 72 kB lt_type | 3 | 8192 bytes | 32 kB | 32 kB | 72 kB lt_cd | 9.82601e+07 | 10 GB | 15 GB | 2832 kB | 25 GB lt_by | 9.82615e+07 | 12 GB | 21 GB | 3360 kB | 33 GB I've got full explain analyze below, but the relevant bits are: _by: SubPlan 2 -> Aggregate (cost=489.06..489.07 rows=1 width=4) (actual time=0.083..0.083 rows=1 loops=692) -> Nested Loop (cost=0.00..489.05 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=692) -> Nested Loop (cost=0.00..485.80 rows=10 width=8) (actual time=0.079..0.079 rows=0 loops=692) Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval))) -> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.054..0.059 rows=0 loops=692) -> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.053..0.057 rows=0 loops=692) Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval))) Filter: ((activity_cd)::text = 'help_draw'::text) -> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.034..0.042 rows=2 loops=18) Index Cond: (customer_id = n.customer_id) -> Index Scan using lt_by__loan_id on lt_by (cost=0.00..6.50 rows=36 width=28) (actual time=0.043..0.477 rows=60 loops=28) Index Cond: (loan_id = l.id) -> Index Scan using by_text_pkey on by_text c (cost=0.00..0.31 rows=1 width=4) (never executed) Index Cond: (id = jnasby.lt_by.created_by_id) Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text) _cd: SubPlan 2 -> Aggregate (cost=3089331.15..3089331.16 rows=1 width=4) (actual time=372.589..372.589 rows=1 loops=692) -> Hash Join (cost=16560.08..3089331.14 rows=1 width=4) (actual time=372.586..372.586 rows=0 loops=692) Hash Cond: (jnasby.lt_cd.loan_id = l.id) Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval))) -> Hash Join (cost=16176.47..3087105.37 rows=491238 width=12) (actual time=14610.882..32223.008 rows=2642 loops=8) Hash Cond: (jnasby.lt_cd.created_by_id = c.id) -> Seq Scan on lt_cd (cost=0.00..2329065.44 rows=98260144 width=32) (actual time=0.018..15123.405 rows=98261588 loops=8) -> Hash (cost=16131.38..16131.38 rows=3607 width=4) (actual time=4519.878..4519.878 rows=20 loops=8) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on by_text c (cost=0.00..16131.38 rows=3607 width=4) (actual time=4405.172..4519.861 rows=20 loops=8) Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text) -> Hash (cost=383.44..383.44 rows=14 width=16) (actual time=0.055..0.055 rows=0 loops=692) Buckets: 1024 Batches: 1 Memory Usage: 0kB -> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.051..0.055 rows=0 loops=692) -> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.049..0.052 rows=0 loops=692) Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval))) Filter: ((activity_cd)::text = 'help_draw'::text) -> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.033..0.040 rows=2 loops=18) Index Cond: (customer_id = n.customer_id) I've tried disabling execution paths to get it to loop join, but no matter what it insists on seqscanning lt_cd. I've verified that simpler queries will use an index scan of loan_id (ie: WHERE loan_id < 999999). Again, I've got multiple queries exhibiting this problem... this is NOT directly related to this query. _by: http://explain.depesz.com/s/LYOY QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Subquery Scan on a (cost=843581.41..843600.81 rows=862 width=24) (actual time=64.735..64.735 rows=0 loops=1) -> Unique (cost=843581.41..843592.19 rows=862 width=20) (actual time=64.735..64.735 rows=0 loops=1) -> Sort (cost=843581.41..843583.57 rows=862 width=20) (actual time=64.733..64.733 rows=0 loops=1) Sort Key: e.login, ((('now'::text)::date - 1)), ((('now'::text)::date - 1)), ((SubPlan 1)) Sort Method: quicksort Memory: 25kB -> HashAggregate (cost=380.35..843539.38 rows=862 width=20) (actual time=64.726..64.726 rows=0 loops=1) Filter: ((SubPlan 2) > 0) -> Hash Join (cost=305.29..371.73 rows=862 width=20) (actual time=4.765..5.866 rows=895 loops=1) Hash Cond: (e_r.employee_id = e.id) -> Nested Loop (cost=15.18..67.61 rows=862 width=12) (actual time=0.228..0.976 rows=895 loops=1) -> Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.009..0.009 rows=1 loops=1) -> Bitmap Heap Scan on employees_roles e_r (cost=15.18..58.96 rows=862 width=4) (actual time=0.214..0.782 rows=895 loops=1) Recheck Cond: (role_id = ANY ('{3,23}'::integer[])) -> Bitmap Index Scan on employees_roles_m1 (cost=0.00..14.97 rows=862 width=0) (actual time=0.195..0.195 rows=895 loops=1) Index Cond: (role_id = ANY ('{3,23}'::integer[])) -> Hash (cost=247.27..247.27 rows=3427 width=12) (actual time=4.521..4.521 rows=3427 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 154kB -> Seq Scan on employees e (cost=0.00..247.27 rows=3427 width=12) (actual time=0.021..3.595 rows=3427 loops=1) SubPlan 1 -> Aggregate (cost=489.06..489.07 rows=1 width=4) (never executed) -> Nested Loop (cost=0.00..489.05 rows=1 width=4) (never executed) -> Nested Loop (cost=0.00..485.80 rows=10 width=8) (never executed) Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval))) -> Nested Loop (cost=0.00..383.44 rows=14 width=16) (never executed) -> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (never executed) Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval))) Filter: ((activity_cd)::text = 'help_draw'::text) -> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (never executed) Index Cond: (customer_id = n.customer_id) -> Index Scan using lt_by__loan_id on lt_by (cost=0.00..6.50 rows=36 width=28) (never executed) Index Cond: (loan_id = l.id) -> Index Scan using by_text_pkey on by_text c (cost=0.00..0.31 rows=1 width=4) (never executed) Index Cond: (id = jnasby.lt_by.created_by_id) Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text) SubPlan 2 -> Aggregate (cost=489.06..489.07 rows=1 width=4) (actual time=0.083..0.083 rows=1 loops=692) -> Nested Loop (cost=0.00..489.05 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=692) -> Nested Loop (cost=0.00..485.80 rows=10 width=8) (actual time=0.079..0.079 rows=0 loops=692) Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval))) -> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.054..0.059 rows=0 loops=692) -> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.053..0.057 rows=0 loops=692) Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval))) Filter: ((activity_cd)::text = 'help_draw'::text) -> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.034..0.042 rows=2 loops=18) Index Cond: (customer_id = n.customer_id) -> Index Scan using lt_by__loan_id on lt_by (cost=0.00..6.50 rows=36 width=28) (actual time=0.043..0.477 rows=60 loops=28) Index Cond: (loan_id = l.id) -> Index Scan using by_text_pkey on by_text c (cost=0.00..0.31 rows=1 width=4) (never executed) Index Cond: (id = jnasby.lt_by.created_by_id) Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text) Total runtime: 65.070 ms (51 rows) _cd: http://explain.depesz.com/s/fkc QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Subquery Scan on a (cost=5326007345.03..5326007364.42 rows=862 width=24) (actual time=257838.254..257838.254 rows=0 loops=1) -> Unique (cost=5326007345.03..5326007355.80 rows=862 width=20) (actual time=257838.254..257838.254 rows=0 loops=1) -> Sort (cost=5326007345.03..5326007347.18 rows=862 width=20) (actual time=257838.254..257838.254 rows=0 loops=1) Sort Key: e.login, ((('now'::text)::date - 1)), ((('now'::text)::date - 1)), ((SubPlan 1)) Sort Method: quicksort Memory: 25kB -> HashAggregate (cost=380.35..5326007303.00 rows=862 width=20) (actual time=257838.246..257838.246 rows=0 loops=1) Filter: ((SubPlan 2) > 0) -> Hash Join (cost=305.29..371.73 rows=862 width=20) (actual time=3.916..4.789 rows=895 loops=1) Hash Cond: (e_r.employee_id = e.id) -> Nested Loop (cost=15.18..67.61 rows=862 width=12) (actual time=0.208..0.788 rows=895 loops=1) -> Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.006..0.006 rows=1 loops=1) -> Bitmap Heap Scan on employees_roles e_r (cost=15.18..58.96 rows=862 width=4) (actual time=0.191..0.626 rows=895 loops=1) Recheck Cond: (role_id = ANY ('{3,23}'::integer[])) -> Bitmap Index Scan on employees_roles_m1 (cost=0.00..14.97 rows=862 width=0) (actual time=0.173..0.173 rows=895 loops=1) Index Cond: (role_id = ANY ('{3,23}'::integer[])) -> Hash (cost=247.27..247.27 rows=3427 width=12) (actual time=3.696..3.696 rows=3427 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 154kB -> Seq Scan on employees e (cost=0.00..247.27 rows=3427 width=12) (actual time=0.012..2.967 rows=3427 loops=1) SubPlan 1 -> Aggregate (cost=3089331.15..3089331.16 rows=1 width=4) (never executed) -> Hash Join (cost=16560.08..3089331.14 rows=1 width=4) (never executed) Hash Cond: (jnasby.lt_cd.loan_id = l.id) Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval))) -> Hash Join (cost=16176.47..3087105.37 rows=491238 width=12) (never executed) Hash Cond: (jnasby.lt_cd.created_by_id = c.id) -> Seq Scan on lt_cd (cost=0.00..2329065.44 rows=98260144 width=32) (never executed) -> Hash (cost=16131.38..16131.38 rows=3607 width=4) (never executed) -> Seq Scan on by_text c (cost=0.00..16131.38 rows=3607 width=4) (never executed) Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text) -> Hash (cost=383.44..383.44 rows=14 width=16) (never executed) -> Nested Loop (cost=0.00..383.44 rows=14 width=16) (never executed) -> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (never executed) Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval))) Filter: ((activity_cd)::text = 'help_draw'::text) -> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (never executed) Index Cond: (customer_id = n.customer_id) SubPlan 2 -> Aggregate (cost=3089331.15..3089331.16 rows=1 width=4) (actual time=372.589..372.589 rows=1 loops=692) -> Hash Join (cost=16560.08..3089331.14 rows=1 width=4) (actual time=372.586..372.586 rows=0 loops=692) Hash Cond: (jnasby.lt_cd.loan_id = l.id) Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval))) -> Hash Join (cost=16176.47..3087105.37 rows=491238 width=12) (actual time=14610.882..32223.008 rows=2642 loops=8) Hash Cond: (jnasby.lt_cd.created_by_id = c.id) -> Seq Scan on lt_cd (cost=0.00..2329065.44 rows=98260144 width=32) (actual time=0.018..15123.405 rows=98261588 loops=8) -> Hash (cost=16131.38..16131.38 rows=3607 width=4) (actual time=4519.878..4519.878 rows=20 loops=8) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on by_text c (cost=0.00..16131.38 rows=3607 width=4) (actual time=4405.172..4519.861 rows=20 loops=8) Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text) -> Hash (cost=383.44..383.44 rows=14 width=16) (actual time=0.055..0.055 rows=0 loops=692) Buckets: 1024 Batches: 1 Memory Usage: 0kB -> Nested Loop (cost=0.00..383.44 rows=14 width=16) (actual time=0.051..0.055 rows=0 loops=692) -> Index Scan using notes_u1 on notes n (cost=0.00..344.23 rows=1 width=12) (actual time=0.049..0.052 rows=0 loops=692) Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval))) Filter: ((activity_cd)::text = 'help_draw'::text) -> Index Scan using loans_m12 on loans l (cost=0.00..38.94 rows=22 width=8) (actual time=0.033..0.040 rows=2 loops=18) Index Cond: (customer_id = n.customer_id) Total runtime: 257838.571 ms (57 rows) -- Jim Nasby, Lead Data Architect (512) 569-9461 -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance