Dimitrios Apostolou <jimis@xxxxxxx> writes:
> Please correct me if I'm wrong, as I'm a newcomer to PostgreSQL, but here
> is how I understand things according to posts I've read, and classical
> algorithms:

> + The Hash Join is fastest when one side fits in work_mem.  Then on one
> hand you have a hash table lookup (amortized O(1)) and on the other
> hand, if the table has M rows that do not fit in memory, you have
> sequential reads from the disk (given low fragmentation of the table or
> index files): for every line you read from the disk, you look up the key
> in the hash table.

> If the hash table does not fit in RAM then the cost becomes prohibitive.
> Every lookup is a random access possibly hitting the disk.  The total
> cost should be random_page_cost * M.

That would be true of a simple hash join, but Postgres uses batched
hash joins: we split up the hash key space into subsets, where hopefully
each subset includes few enough inner-side rows to fit into work_mem.
While this can go wrong given a pathological distribution of the
inner-side keys, it does mean that the join can perform well even when
the inner side is much larger than work_mem.

So it's not the case that the planner will simply disregard hash joins
beyond work_mem.  It will apply a cost penalty for the predicted batching
overhead; but that can still come out cheaper than merge join, because
the sorting needed for merge is generally also far from cheap.

> So I would expect an increased random_page_cost to benefit the Merge Join
> algorithm.  And since my setup involves spinning disks, it does make sense
> to increase it.

What is probably really happening is that random_page_cost affects the
estimated cost of performing the sort using an index scan instead of a
bespoke sort step.  AFAIR, cost_sort doesn't consider random_page_cost
at all, and neither does cost_hashjoin.

			regards, tom lane
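
For readers new to the technique described above, the following is a minimal,
self-contained C sketch of a batched ("Grace"-style) hash join.  It is not
PostgreSQL's implementation: the hash function, the fixed batch count, and the
in-memory partition arrays are simplified stand-ins (the real executor spills
overflow batches to temporary files and chooses the batch count from its size
estimates).  The point it illustrates is that only one batch's hash table has
to fit in memory at a time.

/*
 * Toy batched ("Grace"-style) hash join -- a sketch, not PostgreSQL code.
 * NBATCH and NBUCKETS are arbitrary illustrative constants.
 */
#include <stdio.h>
#include <stdlib.h>

#define NBATCH   4          /* stand-in for the planner's chosen batch count */
#define NBUCKETS 8          /* buckets in each per-batch hash table */

typedef struct Row { int key; int payload; } Row;
typedef struct Bucket { Row row; struct Bucket *next; } Bucket;

static unsigned int hash_key(int key)
{
    return (unsigned int) key * 2654435761u;    /* multiplicative hash */
}

/*
 * Split rows into NBATCH partitions by hash value, so that matching keys
 * from both sides of the join always land in the same batch.
 */
static void partition(const Row *rows, int n, Row **batches, int *counts)
{
    for (int i = 0; i < n; i++)
    {
        int b = hash_key(rows[i].key) % NBATCH;

        batches[b] = realloc(batches[b], (counts[b] + 1) * sizeof(Row));
        batches[b][counts[b]++] = rows[i];
    }
}

int main(void)
{
    /* toy inner and outer relations */
    Row inner[] = { {1, 100}, {2, 200}, {3, 300}, {5, 500}, {8, 800} };
    Row outer[] = { {2, 20}, {3, 30}, {4, 40}, {8, 80} };

    Row *inner_b[NBATCH] = {0}, *outer_b[NBATCH] = {0};
    int  inner_c[NBATCH] = {0},  outer_c[NBATCH] = {0};

    partition(inner, sizeof(inner) / sizeof(inner[0]), inner_b, inner_c);
    partition(outer, sizeof(outer) / sizeof(outer[0]), outer_b, outer_c);

    /*
     * Join one batch at a time: build a hash table over the inner batch,
     * then probe it with the outer batch.  Peak memory is bounded by the
     * largest single batch, not by the whole inner relation.
     */
    for (int b = 0; b < NBATCH; b++)
    {
        Bucket *table[NBUCKETS] = {0};

        for (int i = 0; i < inner_c[b]; i++)
        {
            unsigned int h = hash_key(inner_b[b][i].key) / NBATCH % NBUCKETS;
            Bucket *e = malloc(sizeof(Bucket));

            e->row = inner_b[b][i];
            e->next = table[h];
            table[h] = e;
        }
        for (int i = 0; i < outer_c[b]; i++)
        {
            unsigned int h = hash_key(outer_b[b][i].key) / NBATCH % NBUCKETS;

            for (Bucket *e = table[h]; e != NULL; e = e->next)
                if (e->row.key == outer_b[b][i].key)
                    printf("match: key=%d inner=%d outer=%d\n",
                           e->row.key, e->row.payload, outer_b[b][i].payload);
        }
        /* per-batch allocations are leaked here for brevity */
    }
    return 0;
}

Running the sketch prints the three matching keys (2, 3, and 8), each found
while only its own batch's hash table is held in memory.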