Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?

Scott Carey <scott@xxxxxxxxxxxxxxxxx> · Wed, 13 Apr 2011 10:22:40 -0700

New email-client nightmares!  Fixed below.  I think.
-------------

Sorry for resurrecting this thread, but this has been in my outbox for
months and I think it is important:

>On Oct 27, 2010, at 12:56 PM, Tom Lane wrote:
>
>
>> Scott Carey  writes:
>>Why does hashjoin behave poorly when the inner relation is not
>
>>uniformly distributed and the outer is?
>
>
>
> Because a poorly distributed inner relation leads to long hash chains.
> In the very worst case, all the keys are on the same hash chain and it
> degenerates to a nested-loop join.  (There is an assumption in the
> costing code that the longer hash chains also tend to get searched more
> often, which maybe doesn't apply if the outer rel is flat, but it's not
> obvious that it's safe to not assume that.)
>

I disagree.  Either
1:  The estimator is wrong
or
2:  The hash data structure is flawed.

A pathological skew case (all relations with the same key), should be
_cheaper_ to probe.   There should be only _one_ entry in the hash (for
the one key), and that entry will be a list of all relations matching the
key.  Therefore, hash probes will either instantly fail to match on an
empty bucket, fail to match the one key with one compare, or match the one
key and join on the matching list.

In particular for anti-join, high skew should be the best case scenario.

A hash structure that allows multiple entries per key is inappropriate for
skewed data, because it is not O(n).  One that has one entry per key
remains O(n) for all skew.  Furthermore, the hash buckets and # of entries
is proportional to n_distinct in this case, and smaller and more cache and
memory friendly to probe.

>Not really.  It's still searching a long hash chain; maybe it will find
> an exact match early in the chain, or maybe not.  It's certainly not
> *better* than antijoin with a well-distributed inner rel.

There shouldn't be long hash chains.  A good hash function + proper bucket
count + one entry per key = no long chains.

> Although the
> point is moot, anyway, since if it's an antijoin there is only one
> candidate for which rel to put on the outside.

You can put either relation on the outside with an anti-join, but would
need a different algorithm and cost estimator if done the other way
around.  Construct a hash on the join key, that keeps a list of relations
per key, iterate over the other relation, and remove the key and
corresponding list from the hash when there is a match, when complete the
remaining items in the hash are the result of the join (also already
grouped by the key).  It could be terminated early if all entries are
removed.  
This would be useful if the hash was small, the other side of the hash too
large to fit in memory, and alternative was a massive sort on the other
relation.

Does the hash cost estimator bias towards smaller hashes due to hash probe
cost increasing with hash size due to processor caching effects?  Its not
quite O(n) due to caching effects.

>
>>             regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance