Search Postgresql Archives

Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Is there a TODO anywhere in this discussion?  If so, please let me know.

---------------------------------------------------------------------------

Mischa Sandberg wrote:
> Quoting Mark Lewis <mark.lewis@xxxxxxxx>:
> 
> > If the original paper was published in 1984, then it's been more than
> > 20 years.  Any potential patents would already have expired, no?
> 
> Don't know, but the idea is pervasive among different vendors ...
> perhaps that's a clue. 
> 
> And having now read beyond the start of ExecHashJoin(), I can see that
> PG does indeed implement Grace hash; and the implementation is nice and
> clean.
> 
> If there were room for improvement, (and I didn't see it in the source)
> it would be the logic to:
> 
> - swap inner and outer inputs (batches) when the original inner turned
> out to be too large for memory, and the corresponding outer did not. If
> you implement that anyway (complicates the loops) then it's no trouble
> to just hash the smaller of the two, every time; saves some CPU.
> 
> - recursively partition batches where both inner and outer input batch
> ends up being too large for memory, too; or where the required number of
> batch output buffers alone is too large for working RAM. This is only
> for REALLY big inputs.
> 
> Note that you don't need a bad hash function to get skewed batch sizes;
> you only need a skew distribution of the values being hashed.
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
>       message can get through to the mailing list cleanly
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@xxxxxxxxxxxxxxxx               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux