Search Postgresql Archives

Re: Clustered index to preserve data locality in a multitenant application?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 1, 2016 at 3:08 PM, Igor Neyman <ineyman@xxxxxxxxxxxxxx> wrote:

Don’t know about plans to implement clustered indexes in PostgreSQL.


It was discussed on the mailing list in the past. 

I found an interesting thread dated from 2012 about integrating pg_reorg (the ancestor of pg_repack) in PostgreSQL core:

https://www.postgresql.org/message-id/flat/CAB7nPqTGmNUFi%2BW6F1iwmf7J-o6sY%2Bxxo6Yb%3DmkUVYT-CG-B5A%40mail.gmail.com

There is also an item titled "Automatically maintain clustering on a table" in the TODO list:

https://wiki.postgresql.org/wiki/Todo#CLUSTER
 

Not sure if this was mentioned, MS SQL Server has clustered indexes, where heap row is just stored on the leaf level of the index.

Oracle also has similar feature: IOT, Index Organized Table.

 

It seems to me (may be I’m wrong), that in PostgreSQL it should be much harder to implement clustered index (with the heap row stored in the index leaf) because of the way how MVCC implemented: multiple row versions are stored in the table itself (e.g. Oracle for that purpose keeps table “clean” and stores multiple row versions in UNDO tablespace/segment).


DB2, like PostgreSQL, stores rows in a heap, and not in the leafs of a Btree. But it's possible to define a "clustering" key for a table. When it is defined, DB2 tries to keep the rows in the heap ordered according to the clustering key. If DB2 can’t find space on the page where the row should go, then it looks a few pages before and after and puts it there, and if it still can’t find space then puts it at the end. There is also a feature called "multidimensional clustering" which is even more sophisticated. There is also a command REORG, which would be the equivalent of a non-blocking CLUSTER in PostgreSQL.

I think DB2's approach is interesting because it shows that maintaining spatial coherency is possible with a heap, without having to store rows in a Btree (like InnoDB).

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux