> The discussions at PGCon pointed out that with the posting-list compression logic added in 9.4, GIN indexes are pretty close to this already. Multiple items on the same heap page will typically only take one byte of index space per item; but there is an identifiable entry, so you don't get into these questions of when VACUUM should remove entries, and it's not lossy so you're not forced to pay the overhead of rechecking every entry on the linked-to page.
>
> Not to say that 9.4 GIN is necessarily the last word on the subject, but it would be worth testing it out before deciding that we need something better. (beta1 is out. It needs testing. Hint hint.)

Hint taken, and first impressions are positive: the compression is very efficient for the kind of scenario I'm imagining, where the key is deliberately chosen so that the average page has one distinct key. I have a 25 MB GIN 'cluster' index on a table where the equivalent regular btree index is ten times as large.

So the questions are: a) is this kind of clustering broadly useful (i.e. not just to me)? b) how much effort would it be to implement a 'vacuum-like' operation that scans a designated index and performs the relevant deletes/inserts to achieve this kind of clustering? and c) if it is broadly useful and not a major implementation mountain to climb, is it something that might be added to the todo list?

If someone can tell me how to decode a `ctid` into a page number (discarding the row-number portion - is there a better way than `(replace(replace(ctid::text,'(','{'),')','}')::integer[])[1]`?), I should be able to show some analysis demonstrating this working, albeit inefficiently, since I'll have to scan the table itself for the page/key statistics. Would that sort of analysis be helpful?
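For what it's worth, one trick that avoids the string surgery is to cast the `ctid`'s text form through `point`, since `'(41,7)'` happens to be valid `point` input. A sketch of both the extraction and the page/key statistics I have in mind (`my_table` and `cluster_key` are placeholder names):

```sql
-- The text form of a ctid, e.g. '(41,7)', parses as a point, so the
-- block number falls out as the first coordinate. Note that point
-- coordinates are 0-indexed, unlike the 1-indexed array in the
-- replace/replace expression above.
SELECT (ctid::text::point)[0]::int AS block_number
FROM my_table;

-- Page/key statistics for the clustering experiment: how many distinct
-- cluster keys land on each heap page. With ideal clustering this
-- should be 1 for almost every page.
SELECT (ctid::text::point)[0]::int AS block_number,
       count(DISTINCT cluster_key) AS keys_on_page
FROM my_table
GROUP BY 1
ORDER BY 1;
```

This still requires a full scan of the heap, of course, so it only demonstrates the distribution, not an efficient way to maintain it.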
Kindest regards,
Jack

PS It occurs to me that the btree_gin documentation page for 9.4, http://www.postgresql.org/docs/9.4/static/btree-gin.html, might benefit from some mention of index compression when discussing the relative performance of regular and GIN btree indexes.