At 04:24 AM 2/17/2006, Ragnar wrote:
> On Fri, 2006-02-17 at 01:20 -0500, Ron wrote:
> >
> > OK, so here's _a_ way (there are others) to obtain a mapping such that
> >   if a < b then f(a) < f(b) and
> >   if a == b then f(a) == f(b)
> > By scanning the table once, we can map, say, 0000001h (hex used to ease
> > typing) to the row with the minimum value and 1111111h to the row
> > with the maximum value, as well as mapping everything in between to
> > its appropriate key.  That same scan can be used to assign a
> > pointer to each record's location.
> This step is just as expensive as the original
> sort you want to replace/improve.
Why do you think that?  An external sort involves
the equivalent of multiple scans of the table
being sorted, sometimes more than lg N of them
(where N is the number of items in the table to
be sorted).  Since this is physical IO we are
talking about, each scan is very expensive, and
therefore 1 scan is going to take considerably
less time than >= lg N scans will.
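For concreteness, here's a minimal sketch of that single scan
(illustrative code only, not from any real implementation).  It
assumes the sort key can be abbreviated to an order-preserving
32-bit value -- here, the first four bytes of a text key -- which
gives a < b => f(a) <= f(b); the strict mapping described above
would need a denser key assignment, with ties in the abbreviated
key broken against the full rows.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uint32_t key;   /* order-preserving abbreviation of the sort column */
    uint64_t row;   /* "pointer": offset or ID of the row in the table  */
} pair_t;

/* Hypothetical abbreviation: the first 4 bytes of a text key packed
 * big-endian, so unsigned integer order matches byte order.  Note this
 * only guarantees a < b => k(a) <= k(b); ties must be broken later
 * against the full rows. */
static uint32_t abbrev(const char *v)
{
    uint32_t k = 0;
    for (int i = 0; i < 4; i++) {
        k <<= 8;
        if (*v)
            k |= (unsigned char)*v++;
    }
    return k;
}

int main(void)
{
    const char *table[] = { "delta", "alpha", "charlie", "bravo" };
    size_t n = sizeof table / sizeof table[0];
    pair_t *pairs = malloc(n * sizeof *pairs);

    /* The single scan: one sequential pass over the table, emitting a
     * compact (key, pointer) pair per row. */
    for (size_t i = 0; i < n; i++) {
        pairs[i].key = abbrev(table[i]);
        pairs[i].row = (uint64_t)i;
    }
    for (size_t i = 0; i < n; i++)
        printf("row %llu -> key %08x\n",
               (unsigned long long)pairs[i].row, (unsigned)pairs[i].key);
    free(pairs);
    return 0;
}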
> If you want to keep this mapping saved as a sort
> of an index, or as part of each row's data, this
> will make the cost of inserts and updates enormous.
Not sure you've got this right either.  Looks to
me like we are adding a <= 32-bit quantity to each
row.  Once we know the mapping, incrementally
updating it upon insert or update would seem to
be a simple matter of a fast search for the correct
ranking [interpolation search, for which we have
all the needed data, is O(lg lg N); hash-based
search is O(1)], plus an increment/decrement of
the key values greater/less than the key value of
the row being inserted / updated.  Given that we
are updating all the keys in a specific range
within a tree structure, that update can be done
in O(lg m) (where m is the number of records affected).
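As an illustration of that increment/decrement step (my sketch,
not anything from an existing system): a Fenwick tree in
range-add / point-query form does the "shift every key above the
insertion point" update in O(lg N) per insert -- not quite the
O(lg m) figure above, but the important property, no O(N)
rewrite of the stored keys, is the same.  Finding the insertion
rank itself (by interpolation search or otherwise) is assumed.

#include <stdio.h>

#define MAXN 1024
static long bit[MAXN + 1];

/* Internal Fenwick-tree point update. */
static void update(int i, long d)
{
    for (; i <= MAXN; i += i & -i)
        bit[i] += d;
}

/* Add delta to the stored key of every rank >= p.  O(lg N). */
static void shift_from(int p, long delta)
{
    update(p, delta);
}

/* Accumulated adjustment for rank i, added to its base key.  O(lg N). */
static long shift_at(int i)
{
    long s = 0;
    for (; i > 0; i -= i & -i)
        s += bit[i];
    return s;
}

int main(void)
{
    /* Ranks 1..5 start with base keys 1..5 (the identity mapping). */
    shift_from(3, +1);  /* insert at rank 3: every key >= 3 moves up by 1 */
    for (int r = 1; r <= 5; r++)
        printf("rank %d -> key %ld\n", r, (long)r + shift_at(r));
    /* Key 3 is now free for the newly inserted row. */
    return 0;
}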
> > We can now sort the key+pointer pairs instead of the actual data and
> > use an optional final pass to rearrange the actual rows if we wish.
> How are you suggesting this mapping be accessed?
> If the mapping is kept separate from the tuple
> data, as in an index, then how will you look up the key?
???  We've effectively created a data set in which
each record is a pointer to a DB row plus its
key.  We can now sort that data set by key and
then do an optional final pass to rearrange the
actual DB rows if we so wish.  Since that final
pass is very expensive, it is good that not all
usage scenarios will need it.
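A toy version of that flow (illustrative only; against a real
table the rows live on disk, which is exactly what makes the
final pass the expensive one):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned key; size_t row; } pair_t;

static int by_key(const void *a, const void *b)
{
    unsigned ka = ((const pair_t *)a)->key;
    unsigned kb = ((const pair_t *)b)->key;
    return (ka > kb) - (ka < kb);
}

int main(void)
{
    /* Four 8-byte "rows" standing in for full tuples. */
    char rows[4][8] = { "delta", "alpha", "charlie", "bravo" };
    /* (key, pointer) pairs as produced by the initial scan. */
    pair_t pairs[4] = { {4, 0}, {1, 1}, {3, 2}, {2, 3} };
    size_t n = 4;

    /* The internal sort touches only the small pairs, never the rows. */
    qsort(pairs, n, sizeof pairs[0], by_key);

    /* Optional final pass: gather the rows into sorted order.  On a
     * real table this single pass is full of seeks, since
     * pairs[i].row jumps all over the file. */
    char sorted[4][8];
    for (size_t i = 0; i < n; i++)
        memcpy(sorted[i], rows[pairs[i].row], sizeof sorted[i]);
    for (size_t i = 0; i < n; i++)
        puts(sorted[i]);
    return 0;
}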
The amount of storage required to sort this
representation of the table rather than the
actual table is so much smaller that it turns an
external sorting problem into an internal sorting
problem with an optional final pass that is =1=
scan (albeit one scan with a lot of seeks and
data movement).  This is a big win.  It is a
variation of a well-known technique; see Sedgewick, Knuth, etc.
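To put rough, purely illustrative numbers on it: a table of 10^8
rows at 200 bytes per row is ~20 GB, while 12-byte key+pointer
pairs for the same rows total ~1.2 GB -- small enough to sort
entirely in RAM on hardware where the 20 GB original would have
to be sorted externally.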
> > That initial scan to set up the keys is expensive, but if we wish,
> > that cost can be amortized over the life of the table so we don't
> > have to pay it all at once.  In addition, once we have created those
> > keys, they can be saved for later searches and sorts.
> What is the use case where this would work better than a
> regular btree index?
Again, ???  btree indexes address different
issues.  They do not in any way help create a
representation of the original data compact
enough to turn an external ranking or sorting
problem into an internal one.
Ron