Re: [HACKERS] qsort again (was Re: Strange Create

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At 12:19 PM 2/16/2006, Scott Lamb wrote:
On Feb 16, 2006, at 8:32 AM, Ron wrote:
Let's pretend that we have the typical DB table where rows are
~2-4KB apiece.  1TB of storage will let us have 256M-512M rows in
such a table.

A 32b hash code can be assigned to each row value such that only
exactly equal rows will have the same hash code.
A 32b pointer can locate any of the 256M-512M rows.

Now instead of sorting 1TB of data we can sort 2^28 to 2^29 32b +32b= 64b*(2^28 to 2^29)= 2-4GB of pointers+keys followed by an
optional pass to rearrange the actual rows if we so wish.

I don't understand this.

This is a true statement: (H(x) != H(y)) => (x != y)
This is not: (H(x) < H(y)) => (x < y)

Hash keys can tell you there's an inequality, but they can't tell you
how the values compare. If you want 32-bit keys that compare in the
same order as the original values, here's how you have to get them:
For most hash codes, you are correct. There is a class of hash or hash-like codes that maintains the mapping to support that second statement.

More later when I can get more time.
Ron



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux