Re: Clustered index to preserve data locality in a multitenant application?

Nicolas Grilly <nicolas@xxxxxxxxxxxxxxxx> · Wed, 31 Aug 2016 17:40:51 +0200

Mike Sofen wrote: 
For Nicolas’s situation, that would require 10,000 partitions – not very useful, and each partition would be very small.

This is exactly my conclusion about using partitions in my situation.

In Postgres, as you mentioned, clustering is a “one-time” operation but only in the sense that after you add more rows, you’ll need to re-cluster the table. Depending on the activity model for that table, that may be feasible/ok. For example, if you load it via regular batch scripts, then the clustering could be done after those loads. If you add rows only rarely but then do lots of updates, then the clustering would work great. If this is an active real-time data table, then clustering would not be viable.

The application is very interactive, and new rows are inserted all the time in my use case.
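
To make the concern concrete, here is a rough toy model, not a measurement of Postgres itself. It assumes a table is just a list of tenant ids in physical order, that a page holds a fixed number of rows, and that new rows are simply appended in arrival order. The page size, tenant count, and helper names are all made up for illustration. It counts how many pages must be read to fetch one tenant's rows before a one-time re-sort, right after it, and after more inserts.

```python
import random

ROWS_PER_PAGE = 100  # assumed rows per heap page (illustrative only)
TENANTS = 100        # assumed number of tenants (illustrative only)


def pages_touched(heap, tenant_id):
    """Count distinct pages that hold at least one row of tenant_id."""
    return len({pos // ROWS_PER_PAGE
                for pos, tenant in enumerate(heap) if tenant == tenant_id})


def random_rows(rng, n):
    """Generate n rows in arrival order; each row is just its tenant id."""
    return [rng.randrange(TENANTS) for _ in range(n)]


rng = random.Random(0)
heap = random_rows(rng, 10_000)
before = pages_touched(heap, 7)   # rows scattered in arrival order

heap = sorted(heap)               # stand-in for a one-time CLUSTER by tenant
after = pages_touched(heap, 7)    # tenant rows are now contiguous

heap += random_rows(rng, 10_000)  # new rows appended after the re-sort
later = pages_touched(heap, 7)    # locality degrades again

print(before, after, later)
```

In this sketch the page count for a tenant drops sharply right after the re-sort and climbs again as soon as new rows arrive, which is why a one-time clustering does not seem to fit a table that receives inserts continuously.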

Thanks for your time,

Nicolas