On Tue, 9 Jun 2020 at 01:07, Ron <ronljohnsonjr@xxxxxxxxx> wrote: > > On 6/8/20 3:40 AM, Oleksandr Shulgin wrote: > [snip] > > I've found the original commit adding this feature in version 11: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1aba8e651ac3e37e1d2d875842de1e0ed22a651e > It says: > > "Hash partitioning is useful when you want to partition a growing data > set evenly. This can be useful to keep table sizes reasonable, which > makes maintenance operations such as VACUUM faster, or to enable > partition-wise join." > > > How does hashed (meaning "randomly?) distribution of records make partition-wise joins more efficient? Hash partitioning certainly does not mean putting the tuple in some random partition. It means putting the tuple in the partition with the correct remainder value after dividing the hash value by the largest partition modulus. > Or -- since I interpret that as having to do with "locality of data" -- am I misunderstanding the meaning of "partition-wise joins"? If it was not a partitioned table before then partition-wise joins wouldn't be possible. Having partition-wise joins could make joining two identically partitioned tables faster. We need only look in the corresponding partition on the other side of the join for join partners for each tuple. For hash joins, hash tables can be smaller, which can mean not having to batch, and possibly having the hash table fit better into a CPU cache. For merge joins, sorts, having the data partially pre-sorted in chunks means fewer operations for qsort which can result in speedups. David