Re: [HACKERS] qsort again (was Re: Strange Create

Ragnar <gnari@xxxxxxx> · Fri, 17 Feb 2006 09:24:21 +0000

On fös, 2006-02-17 at 01:20 -0500, Ron wrote:
> At 01:47 PM 2/16/2006, Ron wrote:
> >At 12:19 PM 2/16/2006, Scott Lamb wrote:
> >>On Feb 16, 2006, at 8:32 AM, Ron wrote:
> >>>Let's pretend that we have the typical DB table where rows are
> >>>~2-4KB apiece.  1TB of storage will let us have 256M-512M rows in
> >>>such a table.
> >>>
> >>>A 32b hash code can be assigned to each row value such that only
> >>>exactly equal rows will have the same hash code.
> >>>A 32b pointer can locate any of the 256M-512M rows.
> >>>
> >>>Now instead of sorting 1TB of data we can sort 2^28 to 2^29 32b 
> >>>+32b= 64b*(2^28 to 2^29)=  2-4GB of pointers+keys followed by an
> >>>optional pass to rearrange the actual rows if we so wish.
> >>
> >>I don't understand this.
> >>
> >>This is a true statement: (H(x) != H(y)) => (x != y)
> >>This is not: (H(x) < H(y)) => (x < y)
> >>
> >>Hash keys can tell you there's an inequality, but they can't tell you
> >>how the values compare. If you want 32-bit keys that compare in the
> >>same order as the original values, here's how you have to get them:
> >For most hash codes, you are correct.  There is a class of hash or 
> >hash-like codes that maintains the mapping to support that second statement.
> >
> >More later when I can get more time.
> >Ron
> 
> OK, so here's _a_ way (there are others) to obtain a mapping such that
>   if a < b then f(a) < f (b) and
>   if a == b then f(a) == f(b)

> By scanning the table once, we can map say 0000001h (Hex used to ease 
> typing) to the row with the minimum value and 1111111h to the row 
> with the maximum value as well as mapping everything in between to 
> their appropriate keys.  That same scan can be used to assign a 
> pointer to each record's location.

This step is just as expensive as the original sort you
want to replace/improve. If you want to keep this mapping
saved as a sort of an index, or as part ot each row data, this will make
the cost of inserts and updates enormous.

> 
> We can now sort the key+pointer pairs instead of the actual data and 
> use an optional final pass to rearrange the actual rows if we wish.

How are you suggesting this mapping be accessed? If the 
mapping is kept separate from the tuple data, as in an index, then how
will you look up the key?

> That initial scan to set up the keys is expensive, but if we wish 
> that cost can be amortized over the life of the table so we don't 
> have to pay it all at once.  In addition, once we have created those 
> keys, then can be saved for later searches and sorts.

What is the use case where this would work better than a 
regular btree index ?

gnari