Re: new index type with clustering in mind.

"Jack Douglas" <jack@xxxxxxxxxxxxxxxxxxxxxxx> · Tue, 3 Jun 2014 08:24:58 +0100

> > > To reduce complexity (eg MVCC/snapshot related issues), index entries 
> > > would be added when a row is inserted, but they would not be removed 
> > > when the row is updated/deleted (or when an insert is rolled back): 

> > It's an interesting idea, but, how can you *ever* delete index entries?
> > I.e. is there a way to maintain the index without rebuilding it
> > regularly?
> 
> The discussions at PGCon pointed out that with the posting-list
compression
> logic added in 9.4, GIN indexes are pretty close to this already.
Multiple
> items on the same heap page will typically only take one byte of index
space
> per item; but there is an identifiable entry, so you don't get into these
> questions of when VACUUM should remove entries, and it's not lossy so
> you're not forced to pay the overhead of rechecking every entry on the
> linked-to page.

The README file in the source code for GIN indexes says: "Note: There is no
delete operation in the key (entry) tree. The reason for this is that in our
experience, the set of distinct words in a large corpus changes very slowly.
This greatly simplifies the code and concurrency algorithms.". Does that
mean that for the case where GIN is used as a simple replacement for btree,
my initial suggestion above ("...would be added when a row is inserted, but
they would not be removed...") is effectively what happens already with GIN
indexes?

---

I've written up a test to demonstrate the principle of this 'clustering
lite': http://dba.stackexchange.com/q/66292/1396

In brief, it shows the benefit of this sort of clustering (much lower io
versus unclustered, and no exclusive lock or heavy WAL generation versus
current clustering implementations) with a workable way of achieving it in
the current release. The basic idea is:

1. turn off autovacuum for the table
2. check each block to determine the degree of clustering
3. delete and re-insert all the rows from blocks below a clustering
threshold
4. manually vacuum to free those (complete) blocks
5. repeat steps 2-4 as regularly as necessary

A weakness is that is requires a full table scan is required. All that is
needed to avoid the full-scan would be a way of getting `blkno` from an
index-only scan, which as far as I can tell is not currently possible, is
it?