Hi,

I have some questions regarding the indexing and sampling API. My aim is to implement a variant of progressive indexing as described in this paper (link).

To summarize, I want to implement a variant of online aggregation, where an aggregate query (such as sum or average) is answered in real time and the result becomes more accurate as more tuples are consumed. My idea is to use a custom sampling routine that consumes table samples until the whole table has been seen, with no tuple returned twice. With every consumed sample and returned partial answer, I want to add the consumed tuples to a progressively evolving index.

This would mean that I need to be able to uniquely identify each row in order to add it to the growing index, right? Since OIDs are deprecated / being phased out, I am unsure how to solve this.

Does this sound reasonable, or is there an obvious flaw in my thinking? I would also be thankful for pointers to any material beyond the Postgres documentation that would help me get started with modifying the source to realize something like this.

Regards,
Michael H.
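
P.S. To make the question more concrete, here is a rough sketch of the loop I have in mind, written in plain Python rather than against any Postgres internals. Everything in it is an assumption made for illustration: the table is an in-memory list, `row_id` stands in for whatever unique row identifier turns out to be appropriate, the "index" is just a sorted list of (value, row_id) pairs, and `progressive_sum` is a made-up name. Sampling without duplicates is modelled by walking a random permutation of the rows in batches.

```python
import bisect
import random
from typing import Iterator, List, Tuple

# (row_id, value); row_id is a hypothetical unique row identifier
Row = Tuple[int, float]


def progressive_sum(
    rows: List[Row], batch_size: int, seed: int = 0
) -> Iterator[Tuple[float, List[Tuple[float, int]]]]:
    """Yield (estimated_sum, index_so_far) after each sampled batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    # A random permutation visits every row exactly once,
    # i.e. sampling without replacement.
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)

    index: List[Tuple[float, int]] = []  # sorted (value, row_id) pairs
    seen_sum = 0.0
    seen = 0

    for start in range(0, len(order), batch_size):
        for pos in order[start:start + batch_size]:
            row_id, value = rows[pos]
            seen_sum += value
            seen += 1
            # Grow the index with every consumed tuple, keeping it sorted.
            bisect.insort(index, (value, row_id))
        # Scale the partial sum up to the full table size.
        yield seen_sum * len(rows) / seen, index
```

Iterating over `progressive_sum(table, batch_size=25)` for a 100-row table would give four partial answers. The last one is exact, because by then every row has been consumed once and the index covers the whole table. The part I do not know how to map onto Postgres is the `row_id`.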