Problems with hash join over nested loop

Jim Nasby <jnasby@xxxxxxxxx> · Mon, 28 Oct 2013 16:58:15 -0500

I've been working on trying to normalize a table that's got a bunch of text fields. Normalizing the first 4 has been a non-issue. But when I try and normalize 2 additional fields a bunch of query plans go belly-up.

I've seen this on our old 8.4 databases as well as 9.1. I haven't been able to test on anything newer yet.

A representative query is below. Note that this is a stripped down version of something larger, so it seems a bit silly the way it's written, but I've got others that are exhibiting the same bad behavior:

EXPLAIN ANALYZE
select * from (
select distinct e.login AS Login, d.date1, d.date2,
 (SELECT COUNT(DISTINCT n.customer_id) FROM notes n
     inner join loans l on n.customer_id = l.customer_id
     inner join loan_tasks_committed as ltc on l.id = ltc.loan_id
      AND n.note_time BETWEEN date1 AND date2 + interval '1 day'
      AND n.activity_cd in ('help_draw')
      AND ltc.created_on between n.note_time - interval '10 minutes' and n.note_time + interval '1 day'
      AND getcust(ltc.created_by)=e.login
      where n.employee_id = e.id
    ) AS "Draw"
 FROM
   (select current_date-1 as date1, current_date-1 as date2) as d, employees e
   inner join employees_roles e_r on e.id=e_r.employee_id
 WHERE
   e_r.role_id IN (3,23) --Inbound Customer Support Operator or Customer Service Issue Rep
 GROUP BY
   Login,
   e.id,
   date1,
   date2
 ORDER BY Login) a
 WHERE "Draw" > 0
;

loan_tasks_committed is the table that's been normalized. For testing I'm swapping between tables with two normalization views that boil down to:

_by:
 SELECT lt_by.id, lt_by.lock_version, c.id AS created_by_id, c.by AS created_by, u.id AS updated_by_id, u.by AS updated_by, lt_by.created_on, lt_by.updated_on, lt_by.loan_id, lt_by.entered_on, lt_by.acct_date, lt_by.task_amount, lt_by.loan_task_cd, lt_by.parent_id, lt_by.type_cd, e.id AS entered_by_id, e.by AS entered_by, cl.id AS collected_by_id, cl.by AS collected_by, lt_by.currency_cd, lt_by.committed, lt_by.loan_task_code_id
   FROM lt_by
   LEFT JOIN by_text c ON lt_by.created_by_id = c.id
   LEFT JOIN by_text u ON lt_by.updated_by_id = u.id
   LEFT JOIN by_text e ON lt_by.entered_by_id = e.id
   LEFT JOIN by_text cl ON lt_by.collected_by_id = cl.id;

_cd:
 SELECT lt_cd.id, lt_cd.lock_version, c.id AS created_by_id, c.by AS created_by, u.id AS updated_by_id, u.by AS updated_by, lt_cd.created_on, lt_cd.updated_on, lt_cd.loan_id, lt_cd.entered_on, lt_cd.acct_date, lt_cd.task_amount, ltc.id AS loan_task_code_id, ltc.loan_task_code, ltc.loan_task_code AS loan_task_cd, lt_cd.parent_id, tc.id AS loan_task_type_id, tc.loan_task_type, tc.loan_task_type AS type_cd, e.id AS entered_by_id, e.by AS entered_by, cl.id AS collected_by_id, cl.by AS collected_by, lt_cd.currency_cd, lt_cd.committed
   FROM lt_cd
   LEFT JOIN by_text c ON lt_cd.created_by_id = c.id
   LEFT JOIN by_text u ON lt_cd.updated_by_id = u.id
   LEFT JOIN by_text e ON lt_cd.entered_by_id = e.id
   LEFT JOIN by_text cl ON lt_cd.collected_by_id = cl.id
   LEFT JOIN lt_code ltc ON lt_cd.loan_task_code_id = ltc.id
   LEFT JOIN lt_type tc ON lt_cd.loan_task_type_id = tc.id;

As you can see, they're identical except for normalizing 2 additional fields.

The additional normalization results in a moderate amount of heap space savings:

SELECT table_name, rows, heap_size, index_size, toast_size, total_size FROM tools.space WHERE table_schema='jnasby';
    table_name    |    rows     | heap_size  | index_size | toast_size | total_size
------------------+-------------+------------+------------+------------+------------
 by_text          |      721492 | 41 MB      | 49 MB      | 40 kB      | 90 MB
 lt_code          |          42 | 8192 bytes | 32 kB      | 32 kB      | 72 kB
 lt_type          |           3 | 8192 bytes | 32 kB      | 32 kB      | 72 kB
 lt_cd            | 9.82601e+07 | 10 GB      | 15 GB      | 2832 kB    | 25 GB
 lt_by            | 9.82615e+07 | 12 GB      | 21 GB      | 3360 kB    | 33 GB

I've got full explain analyze below, but the relevant bits are:

_by:
 SubPlan 2
   ->  Aggregate  (cost=489.06..489.07 rows=1 width=4) (actual time=0.083..0.083 rows=1 loops=692)
         ->  Nested Loop  (cost=0.00..489.05 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=692)
               ->  Nested Loop  (cost=0.00..485.80 rows=10 width=8) (actual time=0.079..0.079 rows=0 loops=692)
                     Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval)))
                     ->  Nested Loop  (cost=0.00..383.44 rows=14 width=16) (actual time=0.054..0.059 rows=0 loops=692)
                           ->  Index Scan using notes_u1 on notes n  (cost=0.00..344.23 rows=1 width=12) (actual time=0.053..0.057 rows=0 loops=692)
                                 Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
                                 Filter: ((activity_cd)::text = 'help_draw'::text)
                           ->  Index Scan using loans_m12 on loans l  (cost=0.00..38.94 rows=22 width=8) (actual time=0.034..0.042 rows=2 loops=18)
                                 Index Cond: (customer_id = n.customer_id)
                     ->  Index Scan using lt_by__loan_id on lt_by  (cost=0.00..6.50 rows=36 width=28) (actual time=0.043..0.477 rows=60 loops=28)
                           Index Cond: (loan_id = l.id)
               ->  Index Scan using by_text_pkey on by_text c  (cost=0.00..0.31 rows=1 width=4) (never executed)
                     Index Cond: (id = jnasby.lt_by.created_by_id)
                     Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)

_cd:
 SubPlan 2
   ->  Aggregate  (cost=3089331.15..3089331.16 rows=1 width=4) (actual time=372.589..372.589 rows=1 loops=692)
         ->  Hash Join  (cost=16560.08..3089331.14 rows=1 width=4) (actual time=372.586..372.586 rows=0 loops=692)
               Hash Cond: (jnasby.lt_cd.loan_id = l.id)
               Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval)))
               ->  Hash Join  (cost=16176.47..3087105.37 rows=491238 width=12) (actual time=14610.882..32223.008 rows=2642 loops=8)
                     Hash Cond: (jnasby.lt_cd.created_by_id = c.id)
                     ->  Seq Scan on lt_cd  (cost=0.00..2329065.44 rows=98260144 width=32) (actual time=0.018..15123.405 rows=98261588 loops=8)
                     ->  Hash  (cost=16131.38..16131.38 rows=3607 width=4) (actual time=4519.878..4519.878 rows=20 loops=8)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  Seq Scan on by_text c  (cost=0.00..16131.38 rows=3607 width=4) (actual time=4405.172..4519.861 rows=20 loops=8)
                                 Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
               ->  Hash  (cost=383.44..383.44 rows=14 width=16) (actual time=0.055..0.055 rows=0 loops=692)
                     Buckets: 1024  Batches: 1  Memory Usage: 0kB
                     ->  Nested Loop  (cost=0.00..383.44 rows=14 width=16) (actual time=0.051..0.055 rows=0 loops=692)
                           ->  Index Scan using notes_u1 on notes n  (cost=0.00..344.23 rows=1 width=12) (actual time=0.049..0.052 rows=0 loops=692)
                                 Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
                                 Filter: ((activity_cd)::text = 'help_draw'::text)
                           ->  Index Scan using loans_m12 on loans l  (cost=0.00..38.94 rows=22 width=8) (actual time=0.033..0.040 rows=2 loops=18)
                                 Index Cond: (customer_id = n.customer_id)

I've tried disabling execution paths to get it to loop join, but no matter what it insists on seqscanning lt_cd. I've verified that simpler queries will use an index scan of loan_id (ie: WHERE loan_id < 999999).

Again, I've got multiple queries exhibiting this problem... this is NOT directly related to this query.

_by: http://explain.depesz.com/s/LYOY

                                                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on a  (cost=843581.41..843600.81 rows=862 width=24) (actual time=64.735..64.735 rows=0 loops=1)
   ->  Unique  (cost=843581.41..843592.19 rows=862 width=20) (actual time=64.735..64.735 rows=0 loops=1)
         ->  Sort  (cost=843581.41..843583.57 rows=862 width=20) (actual time=64.733..64.733 rows=0 loops=1)
               Sort Key: e.login, ((('now'::text)::date - 1)), ((('now'::text)::date - 1)), ((SubPlan 1))
               Sort Method: quicksort  Memory: 25kB
               ->  HashAggregate  (cost=380.35..843539.38 rows=862 width=20) (actual time=64.726..64.726 rows=0 loops=1)
                     Filter: ((SubPlan 2) > 0)
                     ->  Hash Join  (cost=305.29..371.73 rows=862 width=20) (actual time=4.765..5.866 rows=895 loops=1)
                           Hash Cond: (e_r.employee_id = e.id)
                           ->  Nested Loop  (cost=15.18..67.61 rows=862 width=12) (actual time=0.228..0.976 rows=895 loops=1)
                                 ->  Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.009..0.009 rows=1 loops=1)
                                 ->  Bitmap Heap Scan on employees_roles e_r  (cost=15.18..58.96 rows=862 width=4) (actual time=0.214..0.782 rows=895 loops=1)
                                       Recheck Cond: (role_id = ANY ('{3,23}'::integer[]))
                                       ->  Bitmap Index Scan on employees_roles_m1  (cost=0.00..14.97 rows=862 width=0) (actual time=0.195..0.195 rows=895 loops=1)
                                             Index Cond: (role_id = ANY ('{3,23}'::integer[]))
                           ->  Hash  (cost=247.27..247.27 rows=3427 width=12) (actual time=4.521..4.521 rows=3427 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 154kB
                                 ->  Seq Scan on employees e  (cost=0.00..247.27 rows=3427 width=12) (actual time=0.021..3.595 rows=3427 loops=1)
                     SubPlan 1
                       ->  Aggregate  (cost=489.06..489.07 rows=1 width=4) (never executed)
                             ->  Nested Loop  (cost=0.00..489.05 rows=1 width=4) (never executed)
                                   ->  Nested Loop  (cost=0.00..485.80 rows=10 width=8) (never executed)
                                         Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval)))
                                         ->  Nested Loop  (cost=0.00..383.44 rows=14 width=16) (never executed)
                                               ->  Index Scan using notes_u1 on notes n  (cost=0.00..344.23 rows=1 width=12) (never executed)
                                                     Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
                                                     Filter: ((activity_cd)::text = 'help_draw'::text)
                                               ->  Index Scan using loans_m12 on loans l  (cost=0.00..38.94 rows=22 width=8) (never executed)
                                                     Index Cond: (customer_id = n.customer_id)
                                         ->  Index Scan using lt_by__loan_id on lt_by  (cost=0.00..6.50 rows=36 width=28) (never executed)
                                               Index Cond: (loan_id = l.id)
                                   ->  Index Scan using by_text_pkey on by_text c  (cost=0.00..0.31 rows=1 width=4) (never executed)
                                         Index Cond: (id = jnasby.lt_by.created_by_id)
                                         Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
                     SubPlan 2
                       ->  Aggregate  (cost=489.06..489.07 rows=1 width=4) (actual time=0.083..0.083 rows=1 loops=692)
                             ->  Nested Loop  (cost=0.00..489.05 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=692)
                                   ->  Nested Loop  (cost=0.00..485.80 rows=10 width=8) (actual time=0.079..0.079 rows=0 loops=692)
                                         Join Filter: ((jnasby.lt_by.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_by.created_on <= (n.note_time + '1 day'::interval)))
                                         ->  Nested Loop  (cost=0.00..383.44 rows=14 width=16) (actual time=0.054..0.059 rows=0 loops=692)
                                               ->  Index Scan using notes_u1 on notes n  (cost=0.00..344.23 rows=1 width=12) (actual time=0.053..0.057 rows=0 loops=692)
                                                     Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
                                                     Filter: ((activity_cd)::text = 'help_draw'::text)
                                               ->  Index Scan using loans_m12 on loans l  (cost=0.00..38.94 rows=22 width=8) (actual time=0.034..0.042 rows=2 loops=18)
                                                     Index Cond: (customer_id = n.customer_id)
                                         ->  Index Scan using lt_by__loan_id on lt_by  (cost=0.00..6.50 rows=36 width=28) (actual time=0.043..0.477 rows=60 loops=28)
                                               Index Cond: (loan_id = l.id)
                                   ->  Index Scan using by_text_pkey on by_text c  (cost=0.00..0.31 rows=1 width=4) (never executed)
                                         Index Cond: (id = jnasby.lt_by.created_by_id)
                                         Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
 Total runtime: 65.070 ms
(51 rows)

_cd: http://explain.depesz.com/s/fkc
                                                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on a  (cost=5326007345.03..5326007364.42 rows=862 width=24) (actual time=257838.254..257838.254 rows=0 loops=1)
   ->  Unique  (cost=5326007345.03..5326007355.80 rows=862 width=20) (actual time=257838.254..257838.254 rows=0 loops=1)
         ->  Sort  (cost=5326007345.03..5326007347.18 rows=862 width=20) (actual time=257838.254..257838.254 rows=0 loops=1)
               Sort Key: e.login, ((('now'::text)::date - 1)), ((('now'::text)::date - 1)), ((SubPlan 1))
               Sort Method: quicksort  Memory: 25kB
               ->  HashAggregate  (cost=380.35..5326007303.00 rows=862 width=20) (actual time=257838.246..257838.246 rows=0 loops=1)
                     Filter: ((SubPlan 2) > 0)
                     ->  Hash Join  (cost=305.29..371.73 rows=862 width=20) (actual time=3.916..4.789 rows=895 loops=1)
                           Hash Cond: (e_r.employee_id = e.id)
                           ->  Nested Loop  (cost=15.18..67.61 rows=862 width=12) (actual time=0.208..0.788 rows=895 loops=1)
                                 ->  Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.006..0.006 rows=1 loops=1)
                                 ->  Bitmap Heap Scan on employees_roles e_r  (cost=15.18..58.96 rows=862 width=4) (actual time=0.191..0.626 rows=895 loops=1)
                                       Recheck Cond: (role_id = ANY ('{3,23}'::integer[]))
                                       ->  Bitmap Index Scan on employees_roles_m1  (cost=0.00..14.97 rows=862 width=0) (actual time=0.173..0.173 rows=895 loops=1)
                                             Index Cond: (role_id = ANY ('{3,23}'::integer[]))
                           ->  Hash  (cost=247.27..247.27 rows=3427 width=12) (actual time=3.696..3.696 rows=3427 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 154kB
                                 ->  Seq Scan on employees e  (cost=0.00..247.27 rows=3427 width=12) (actual time=0.012..2.967 rows=3427 loops=1)
                     SubPlan 1
                       ->  Aggregate  (cost=3089331.15..3089331.16 rows=1 width=4) (never executed)
                             ->  Hash Join  (cost=16560.08..3089331.14 rows=1 width=4) (never executed)
                                   Hash Cond: (jnasby.lt_cd.loan_id = l.id)
                                   Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval)))
                                   ->  Hash Join  (cost=16176.47..3087105.37 rows=491238 width=12) (never executed)
                                         Hash Cond: (jnasby.lt_cd.created_by_id = c.id)
                                         ->  Seq Scan on lt_cd  (cost=0.00..2329065.44 rows=98260144 width=32) (never executed)
                                         ->  Hash  (cost=16131.38..16131.38 rows=3607 width=4) (never executed)
                                               ->  Seq Scan on by_text c  (cost=0.00..16131.38 rows=3607 width=4) (never executed)
                                                     Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
                                   ->  Hash  (cost=383.44..383.44 rows=14 width=16) (never executed)
                                         ->  Nested Loop  (cost=0.00..383.44 rows=14 width=16) (never executed)
                                               ->  Index Scan using notes_u1 on notes n  (cost=0.00..344.23 rows=1 width=12) (never executed)
                                                     Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
                                                     Filter: ((activity_cd)::text = 'help_draw'::text)
                                               ->  Index Scan using loans_m12 on loans l  (cost=0.00..38.94 rows=22 width=8) (never executed)
                                                     Index Cond: (customer_id = n.customer_id)
                     SubPlan 2
                       ->  Aggregate  (cost=3089331.15..3089331.16 rows=1 width=4) (actual time=372.589..372.589 rows=1 loops=692)
                             ->  Hash Join  (cost=16560.08..3089331.14 rows=1 width=4) (actual time=372.586..372.586 rows=0 loops=692)
                                   Hash Cond: (jnasby.lt_cd.loan_id = l.id)
                                   Join Filter: ((jnasby.lt_cd.created_on >= (n.note_time - '00:10:00'::interval)) AND (jnasby.lt_cd.created_on <= (n.note_time + '1 day'::interval)))
                                   ->  Hash Join  (cost=16176.47..3087105.37 rows=491238 width=12) (actual time=14610.882..32223.008 rows=2642 loops=8)
                                         Hash Cond: (jnasby.lt_cd.created_by_id = c.id)
                                         ->  Seq Scan on lt_cd  (cost=0.00..2329065.44 rows=98260144 width=32) (actual time=0.018..15123.405 rows=98261588 loops=8)
                                         ->  Hash  (cost=16131.38..16131.38 rows=3607 width=4) (actual time=4519.878..4519.878 rows=20 loops=8)
                                               Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                               ->  Seq Scan on by_text c  (cost=0.00..16131.38 rows=3607 width=4) (actual time=4405.172..4519.861 rows=20 loops=8)
                                                     Filter: ("substring"((by)::text, '^[CAE]:(.*?) I:'::text) = (e.login)::text)
                                   ->  Hash  (cost=383.44..383.44 rows=14 width=16) (actual time=0.055..0.055 rows=0 loops=692)
                                         Buckets: 1024  Batches: 1  Memory Usage: 0kB
                                         ->  Nested Loop  (cost=0.00..383.44 rows=14 width=16) (actual time=0.051..0.055 rows=0 loops=692)
                                               ->  Index Scan using notes_u1 on notes n  (cost=0.00..344.23 rows=1 width=12) (actual time=0.049..0.052 rows=0 loops=692)
                                                     Index Cond: ((employee_id = e.id) AND (note_time >= ((('now'::text)::date - 1))) AND (note_time <= (((('now'::text)::date - 1)) + '1 day'::interval)))
                                                     Filter: ((activity_cd)::text = 'help_draw'::text)
                                               ->  Index Scan using loans_m12 on loans l  (cost=0.00..38.94 rows=22 width=8) (actual time=0.033..0.040 rows=2 loops=18)
                                                     Index Cond: (customer_id = n.customer_id)
 Total runtime: 257838.571 ms
(57 rows)
--
Jim Nasby, Lead Data Architect   (512) 569-9461

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance