Hi All,

I have a question about hash joins and the meaning of "width" in the EXPLAIN output for a query.

I have two large tables: table1 has 1 million rows and table2 has 600 million rows. When I join these two tables using one constraint on table1 (which reduces the candidate rows to 660,888), the optimizer seems to correctly choose to hash the values from table1.

explain select count(1) from table1 g join table2 x on x.granuleid = g.granuleid where g.collectionid = 22467;

                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=18200480.80..18200480.81 rows=1 width=8)
   ->  Hash Join  (cost=103206.82..18190602.43 rows=3951347 width=0)
         Hash Cond: (x.granuleid = g.granuleid)
         ->  Seq Scan on table2 x  (cost=0.00..10596253.01 rows=644241901 width=8)
         ->  Hash  (cost=92363.72..92363.72 rows=660888 width=8)
               ->  Index Only Scan using idx_table1 on table1 g  (cost=0.57..92363.72 rows=660888 width=8)
                     Index Cond: (collectionid = '22467'::bigint)
(7 rows)

My question is: what gets put into the Hash? I assume that "width=8" refers to the size of the key. Does the entire row get copied into the hash as the corresponding value?

The reason I ask is that when I try to run the query, it fails because temp file use exceeds 10GB. How do I accurately determine the amount of memory that will be used?

select count(1) from table1 g join table2 x on x.granuleid = g.granuleid where g.collectionid = 22467;
ERROR:  temporary file size exceeds temp_file_limit (10485760 kB)

Thanks in advance,
Bob