Using the indexing and sampling APIs to realize progressive features

<hohenstein@xxxxxxxxxxxx> · Thu, 3 Feb 2022 16:24:54 +0100

Hi, 

I have some questions regarding the indexing and sampling APIs.
My aim is to implement a variant of progressive indexing as described in this paper (link).
To summarize, I want to implement a variant of online aggregation: an aggregate query (SUM, AVG, etc.) is answered in real time, and the result becomes more and more accurate as tuples are consumed.
My idea is to use a custom sampling routine that consumes table samples until it has seen the whole table, without returning any tuple twice.
Meanwhile, with every sample consumed and partial answer returned, I want to add the consumed tuples to a progressively growing index.
This would mean that I need to be able to uniquely identify each row in order to add it to the growing index, right? Since row OIDs are deprecated / phased out, I am still unsure how to solve this.
Does this sound reasonable or is there an obvious flaw in my thinking?
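
To make the question more concrete, here is a rough sketch of the behaviour I have in mind. It is plain Python, not PostgreSQL code, and everything in it is an assumption for illustration: the function name consume_progressively is made up, the "table" is just a list of blocks, the row identifier is a (block number, offset) pair, and the "index" is a sorted list. Blocks are visited in a random order without replacement, so no tuple is consumed twice, and a partial aggregate is returned after every block.

```python
import random
from bisect import insort


def consume_progressively(blocks, index, seed=0):
    """Yield (rows_seen, running_sum, running_avg) after each consumed block.

    blocks: list of lists of numbers, standing in for table pages (hypothetical).
    index:  list that is filled with (value, (block_no, offset)) entries and
            kept sorted by value; a stand-in for the growing index.
    """
    rng = random.Random(seed)
    order = list(range(len(blocks)))
    rng.shuffle(order)  # random permutation: each block is visited exactly once

    total = 0
    seen = 0
    for block_no in order:
        for offset, value in enumerate(blocks[block_no]):
            total += value
            seen += 1
            # (block_no, offset) is the assumed unique row identifier
            insort(index, (value, (block_no, offset)))
        if seen == 0:
            continue  # nothing consumed yet, no partial answer to report
        yield seen, total, total / seen


table = [[3, 1], [4, 1, 5], [9]]
index = []
for seen, total, avg in consume_progressively(table, index):
    print(f"rows seen: {seen}, sum so far: {total}, avg so far: {avg:.3f}")
```

After the last block the running sum is exact and the index covers the whole table. What I do not know is which identifier should play the role of the (block number, offset) pair inside Postgres.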
I would also be grateful for pointers to any material beyond the Postgres documentation that would help me get started with modifying the source to realize something like this.

Regards
Michael H.