On Sat, May 24, 2014 at 05:58:37PM +0100, Jack Douglas wrote:
> Would the following be practical to implement:
>
> A btree-like index type that points to *pages* rather than individual rows.
> Ie if there are many rows in a page with the same data (in the indexed
> columns), only one index entry will exist. In its normal use case, this
> index would be much smaller than a regular index on the same columns which
> would contain one entry for each individual row.
>
> To reduce complexity (eg MVCC/snapshot related issues), index entries would
> be added when a row is inserted, but they would not be removed when the row
> is updated/deleted (or when an insert is rolled back): this would cause
> index bloat over time in volatile tables but this would be acceptable for
> the use case I have in mind. So in essence, an entry in the index would
> indicate that there *may* be matching rows in the page, not that there
> actually are.

It's an interesting idea, but, how can you *ever* delete index entries?
I.e. is there a way to maintain the index without rebuilding it
regularly?

Maybe there's something you could do with tracking all the entries that
point to one page or something, or a counter. Because really, the fact
that the item pointer in a btree index includes the item number is only
really needed for deletion.

Postgres always has to read in the whole page anyway, so if you can
find a way around that it might be an interesting improvement.

Mind you, hash indexes could get this almost free, except they're not
crash safe.

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@xxxxxxxxx> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer
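
For illustration only, the page-level index from the quoted proposal might be sketched in Python roughly as follows. This is a toy in-memory model, not PostgreSQL code: the names `PageIndex`, `Heap`, `candidate_pages` and `rows_per_page` are all made up for the example, a plain dict stands in for the btree ordering, and MVCC is ignored entirely. It shows the two properties under discussion: one index entry per (key, page) rather than per row, and entries that are never removed, so a lookup has to recheck every row on each candidate page.

```python
class PageIndex:
    """Hypothetical lossy index: key -> pages that *may* hold matching rows."""

    def __init__(self):
        self.entries = {}  # key -> set of page numbers (a dict stands in for the btree)

    def insert(self, key, page_no):
        # Many rows with the same key on one page share a single entry.
        self.entries.setdefault(key, set()).add(page_no)

    def candidate_pages(self, key):
        return sorted(self.entries.get(key, ()))


class Heap:
    """Toy table: a list of fixed-capacity pages of [key, payload, live] rows."""

    def __init__(self, rows_per_page=4):
        self.rows_per_page = rows_per_page
        self.pages = []
        self.index = PageIndex()

    def insert(self, key, payload):
        if not self.pages or len(self.pages[-1]) >= self.rows_per_page:
            self.pages.append([])
        self.pages[-1].append([key, payload, True])
        self.index.insert(key, len(self.pages) - 1)

    def delete(self, key):
        # Rows are marked dead; the index is deliberately left untouched,
        # which is where the bloat in the proposal comes from.
        for page in self.pages:
            for row in page:
                if row[0] == key:
                    row[2] = False

    def lookup(self, key):
        # The index only says "maybe here", so the whole page is rechecked.
        found = []
        for page_no in self.index.candidate_pages(key):
            for row_key, payload, live in self.pages[page_no]:
                if live and row_key == key:
                    found.append(payload)
        return found


heap = Heap(rows_per_page=4)
for i in range(8):
    heap.insert("a", i)   # eight rows, but only two (key, page) index entries
heap.insert("b", 99)
heap.delete("a")          # stale entries for "a" stay in the index
```

After the delete, `heap.index.candidate_pages("a")` still lists both pages while `heap.lookup("a")` finds nothing, which is the "may be matching rows" behaviour described in the quoted text. Getting rid of those stale entries is exactly the open question raised in the reply; the sketch does not attempt it.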