Search Postgresql Archives

Re: Why is a hash join preferred when it does not fit in work_mem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dimitrios Apostolou <jimis@xxxxxxx> writes:
> Please correct me if I'm wrong, as I'm a newcomer to PostgreSQL, but here
> is how I understand things according to posts I've read, and classical
> algorithms:

> + The Hash Join is fastest when one side fits in work_mem. Then on one
>    hand you have a hash table lookup (amortized O(1)) and on the other
>    hand, if the table has M rows that that do not fit in memory, you have
>    sequential reads from the disk (given low fragmentation of the table or
>    index files):  For every line you read from the disk, you lookup the key
>    in the hash table.

>    If the hash table does not fit in RAM then the cost becomes prohibitive.
>    Every lookup is a random access possibly hitting the disk. The total
>    cost should be random_page_cost * M.

That would be true of a simple hash join, but Postgres uses batched
hash joins: we split up the hash key space into subsets, where hopefully
each subset includes few enough inner-side rows to fit into work_mem.
While this can go wrong given pathological distribution of the inner-side
keys, it does mean that the join can perform well even when the inner
side is much larger than work_mem.  So it's not the case that the planner
will simply disregard hash joins beyond work_mem.  It will apply a cost
penalty for the predicted batching overhead; but that can still come out
cheaper than merge join, because the sorting needed for merge is generally
also far from cheap.

> So I would expect an increased random_page_cost to benefit the Merge Join
> algorithm. And since my setup involves spinning disks, it does makes sense
> to increase it.

What is probably really happening is that random_page_cost affects the
estimated cost of performing the sort using an index scan instead of
a bespoke sort step.  AFAIR, cost_sort doesn't consider random_page_cost
at all, and neither does cost_hashjoin.

			regards, tom lane





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux