On Wed, Apr 13, 2011 at 1:22 PM, Scott Carey <scott@xxxxxxxxxxxxxxxxx> wrote:
> A pathological skew case (all relations with the same key) should be
> _cheaper_ to probe. There should be only _one_ entry in the hash (for
> the one key), and that entry will be a list of all relations matching the
> key. Therefore, hash probes will either instantly fail to match on an
> empty bucket, fail to match the one key with one compare, or match the one
> key and join on the matching list.
>
> In particular for anti-join, high skew should be the best case scenario.

I think this argument may hold some water for an anti-join, and maybe
for a semi-join, but it sure doesn't seem right for any kind of join
that has to iterate over all matches (rather than just the first one);
that is, inner, left, right, or full.

> A hash structure that allows multiple entries per key is inappropriate for
> skewed data, because it is not O(n). One that has one entry per key
> remains O(n) for all skew. Furthermore, the hash buckets and # of entries
> are proportional to n_distinct in this case, and smaller and more cache-
> and memory-friendly to probe.

I don't think this argument is right. The hash table is sized for a
load factor significantly less than one, so if there are multiple
entries in a bucket, it is fairly likely that they are all for the
same key. Granted, we have to double-check the keys to figure that
out, but I believe that the data structure you are proposing would
require similar comparisons. The only difference is that they'd be
required when building the hash table rather than when probing it.

> You can put either relation on the outside with an anti-join, but would
> need a different algorithm and cost estimator if done the other way
> around. Construct a hash on the join key that keeps a list of relations
> per key; iterate over the other relation, and remove the key and
> corresponding list from the hash when there is a match; when complete, the
> remaining items in the hash are the result of the join (also already
> grouped by the key). It could be terminated early if all entries are
> removed.
>
> This would be useful if the hash was small, the other side of the hash too
> large to fit in memory, and the alternative was a massive sort on the other
> relation.

This would be a nice extension of commit
f4e4b3274317d9ce30de7e7e5b04dece7c4e1791.

> Does the hash cost estimator bias towards smaller hashes due to hash probe
> cost increasing with hash size due to processor caching effects? It's not
> quite O(n) due to caching effects.

I don't think we account for that (and I'm not convinced we need to).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
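
To make the two hash-table shapes under discussion concrete, here is a
toy sketch in Python (purely illustrative; this is not PostgreSQL's
hash join code, and the function names and bucket count are invented).
It shows where the key comparisons land in each shape, and why only
joins that can stop at the first match get cheaper under skew:

# Toy sketch, not PostgreSQL code.  Tuples are (join_key, payload)
# pairs; NBUCKETS stands in for the executor sizing the table to a
# load factor below one.
NBUCKETS = 1024

def build_entry_per_tuple(inner):
    # Current shape: every inner tuple is its own chain entry.  A
    # heavily skewed key yields a long chain, but building needs no
    # key comparisons at all.
    buckets = [[] for _ in range(NBUCKETS)]
    for key, payload in inner:
        buckets[hash(key) % NBUCKETS].append((key, payload))
    return buckets

def build_entry_per_key(inner):
    # Proposed shape: one chain entry per distinct key, carrying the
    # list of matching tuples.  The key comparisons have not gone
    # away; they have moved from probe time to build time.
    buckets = [[] for _ in range(NBUCKETS)]
    for key, payload in inner:
        chain = buckets[hash(key) % NBUCKETS]
        for entry in chain:
            if entry[0] == key:        # comparison paid while building
                entry[1].append(payload)
                break
        else:
            chain.append((key, [payload]))
    return buckets

# The probes below assume the per-key shape built above.

def probe_inner(buckets, key, opayload):
    # Inner/left/right/full joins must emit *every* match, so even
    # with the per-key shape the long tuple list is walked in full;
    # skew is no cheaper here.
    out = []
    for entry_key, payloads in buckets[hash(key) % NBUCKETS]:
        if entry_key == key:
            out.extend((opayload, p) for p in payloads)
    return out

def probe_semi_or_anti(buckets, key):
    # Semi/anti joins only ask "is there any match?", so the per-key
    # shape answers after at most one comparison per chain entry --
    # the best case Scott describes for high skew.
    return any(entry_key == key
               for entry_key, _ in buckets[hash(key) % NBUCKETS])

Either way, a skewed key means a long list somewhere; the per-key
shape only changes who walks it, and when.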
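
Along the same lines, a hedged sketch of the reversed anti-join Scott
describes: hash the relation whose unmatched rows we want to keep,
stream the other (possibly much larger) relation past it deleting keys
as they match, and terminate early if the table empties. The function
name and tuple shapes are invented; this is a sketch of the proposal,
not an existing PostgreSQL code path:

from collections import defaultdict

def reversed_hash_anti_join(keep_side, stream_side):
    # Build: one entry per distinct join key, holding the list of
    # tuples sharing that key.
    table = defaultdict(list)
    for key, payload in keep_side:
        table[key].append(payload)

    # Probe: a match disqualifies the whole key, so drop it outright.
    for key, _ in stream_side:
        table.pop(key, None)
        if not table:          # early termination: everything matched
            break

    # Survivors are keep_side rows with no partner in stream_side,
    # conveniently already grouped by key.
    return table

# Example: rows of keep_side whose keys never appear in stream_side
# survive.
keep = [(1, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]
stream = [(1, 'x'), (4, 'y')]
print(dict(reversed_hash_anti_join(keep, stream)))  # {2: ['c'], 3: ['d']}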