Hi,
I am testing things on hash aggregate spilling in version 13.1 and am struggling to understand the partition thing in the two explains below.
My understanding is that a partition corresponds to a spill file which will be treated in a later batch (which can re-spill in some cases).
Am I right to think that the second explain analyze says that PostgreSQL was planning for 8 batches (there are 8 planned partitions) and that only one was necessary (= no spill files) ?
regards
benoit
[local]:5433 postgres@postgres=# CREATE TABLE tableA(ac1 int, ac2 int);
CREATE TABLE
[local]:5433 postgres@postgres=# CREATE TABLE tableB(bc1 int, bc2 int);
CREATE TABLE
[local]:5433 postgres@postgres=# INSERT INTO tableA SELECT x, random()*100 FROM generate_series(1,1000000) AS F(x);
INSERT 0 1000000
[local]:5433 postgres@postgres=# INSERT INTO tableB SELECT mod(x,100000), random()*100 FROM generate_series(1,1000000) AS F(x);
INSERT 0 1000000
CREATE TABLE
[local]:5433 postgres@postgres=# CREATE TABLE tableB(bc1 int, bc2 int);
CREATE TABLE
[local]:5433 postgres@postgres=# INSERT INTO tableA SELECT x, random()*100 FROM generate_series(1,1000000) AS F(x);
INSERT 0 1000000
[local]:5433 postgres@postgres=# INSERT INTO tableB SELECT mod(x,100000), random()*100 FROM generate_series(1,1000000) AS F(x);
INSERT 0 1000000
[local]:5433 postgres@postgres=# SELECT name, setting, unit FROM pg_settings WHERE name IN('work_mem', 'hash_mem_multiplier');
name | setting | unit
---------------------+---------+------
hash_mem_multiplier | 1 | NULL
work_mem | 4096 | kB
(2 rows)
[local]:5433 postgres@postgres=# EXPLAIN (ANALYZE) SELECT ac1, count(ac2), sum(bc2) FROM tableA INNER JOIN TABLEB ON ac1 = bc1 GROUP BY Ac1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=137290.50..157056.12 rows=1000000 width=20) (actual time=773.405..889.020 rows=99999 loops=1)
Group Key: tablea.ac1
Planned Partitions: 32 Batches: 33 Memory Usage: 4369kB Disk Usage: 30456kB
-> Hash Join (cost=30832.00..70728.00 rows=1000000 width=12) (actual time=158.774..583.031 rows=999990 loops=1)
Hash Cond: (tableb.bc1 = tablea.ac1)
-> Seq Scan on tableb (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.023..77.297 rows=1000000 loops=1)
-> Hash (cost=14425.00..14425.00 rows=1000000 width=8) (actual time=158.378..158.379 rows=1000000 loops=1)
Buckets: 131072 Batches: 16 Memory Usage: 3471kB
-> Seq Scan on tablea (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.010..53.476 rows=1000000 loops=1)
Planning Time: 0.824 ms
Execution Time: 895.251 ms
(11 rows)
[local]:5433 postgres@postgres=# SET hash_mem_multiplier TO 5;
SET
[local]:5433 postgres@postgres=# EXPLAIN (ANALYZE) SELECT ac1, count(ac2), sum(bc2) FROM tableA INNER JOIN TABLEB ON ac1 = bc1 GROUP BY Ac1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=137290.50..157056.12 rows=1000000 width=20) (actual time=696.684..714.198 rows=99999 loops=1)
Group Key: tablea.ac1
Planned Partitions: 8 Batches: 1 Memory Usage: 15633kB
-> Hash Join (cost=30832.00..70728.00 rows=1000000 width=12) (actual time=171.789..560.692 rows=999990 loops=1)
Hash Cond: (tableb.bc1 = tablea.ac1)
-> Seq Scan on tableb (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.032..78.718 rows=1000000 loops=1)
-> Hash (cost=14425.00..14425.00 rows=1000000 width=8) (actual time=168.592..168.593 rows=1000000 loops=1)
Buckets: 524288 Batches: 4 Memory Usage: 13854kB
-> Seq Scan on tablea (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.018..52.796 rows=1000000 loops=1)
Planning Time: 0.242 ms
Execution Time: 717.914 ms
(11 rows)
name | setting | unit
---------------------+---------+------
hash_mem_multiplier | 1 | NULL
work_mem | 4096 | kB
(2 rows)
[local]:5433 postgres@postgres=# EXPLAIN (ANALYZE) SELECT ac1, count(ac2), sum(bc2) FROM tableA INNER JOIN TABLEB ON ac1 = bc1 GROUP BY Ac1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=137290.50..157056.12 rows=1000000 width=20) (actual time=773.405..889.020 rows=99999 loops=1)
Group Key: tablea.ac1
Planned Partitions: 32 Batches: 33 Memory Usage: 4369kB Disk Usage: 30456kB
-> Hash Join (cost=30832.00..70728.00 rows=1000000 width=12) (actual time=158.774..583.031 rows=999990 loops=1)
Hash Cond: (tableb.bc1 = tablea.ac1)
-> Seq Scan on tableb (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.023..77.297 rows=1000000 loops=1)
-> Hash (cost=14425.00..14425.00 rows=1000000 width=8) (actual time=158.378..158.379 rows=1000000 loops=1)
Buckets: 131072 Batches: 16 Memory Usage: 3471kB
-> Seq Scan on tablea (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.010..53.476 rows=1000000 loops=1)
Planning Time: 0.824 ms
Execution Time: 895.251 ms
(11 rows)
[local]:5433 postgres@postgres=# SET hash_mem_multiplier TO 5;
SET
[local]:5433 postgres@postgres=# EXPLAIN (ANALYZE) SELECT ac1, count(ac2), sum(bc2) FROM tableA INNER JOIN TABLEB ON ac1 = bc1 GROUP BY Ac1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=137290.50..157056.12 rows=1000000 width=20) (actual time=696.684..714.198 rows=99999 loops=1)
Group Key: tablea.ac1
Planned Partitions: 8 Batches: 1 Memory Usage: 15633kB
-> Hash Join (cost=30832.00..70728.00 rows=1000000 width=12) (actual time=171.789..560.692 rows=999990 loops=1)
Hash Cond: (tableb.bc1 = tablea.ac1)
-> Seq Scan on tableb (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.032..78.718 rows=1000000 loops=1)
-> Hash (cost=14425.00..14425.00 rows=1000000 width=8) (actual time=168.592..168.593 rows=1000000 loops=1)
Buckets: 524288 Batches: 4 Memory Usage: 13854kB
-> Seq Scan on tablea (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.018..52.796 rows=1000000 loops=1)
Planning Time: 0.242 ms
Execution Time: 717.914 ms
(11 rows)