On Fri, Dec 26, 2014 at 11:41 PM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
> On Fri, Dec 26, 2014 at 1:19 PM, Dan S <strd911@xxxxxxxxx> wrote:
>> Well I'm trying to implement a window-function that works on range_types and
>> produces 'atomic ranges' for each input range.
>> Let's say I have a set of ranges some overlapping some not, and I want to
>> split each range at every boundary of every overlapping range and return
>> those.
>> So for each range r I want to return an array of ranges that consists of
>> range r split at every overlapping range boundary.
>> I need to consider both upper and lower boundaries and to be able to do this
>> in one pass over the data I need to sort both the upper and lower boundaries
>> and also sort the original ranges so I can loop over the ranges and
>> boundaries in lockstep to produce the arrays.
>> As a last step I sort back the arrays of atomic ranges in the original order
>> in the window so I can read out the tuplesort in that order and return each
>> array to its corresponding range r.
>>
>> (The result can be used to answer questions like what are the maximum number
>> of simultaneously overlapping ranges and at which ranges the maximum occurs)
shoot -- sorry for empty mail -- misclick.
anyways, can you give some precise examples of what you want to occur?
for example, a set of sql taking actual inputs and the output you
would like to see...
merlin
Say I have theses ranges (in a table or as output of a sql statement):
[2000-01-01,2000-05-01)
[2000-02-01,2003-07-01)
[2002-06-01,2003-03-01)
[2003-04-01,2004-08-01)
then for the first range [2000-01-01,2000-05-01) I want the array output :
{[2000-01-01,2000-02-01),[2000-02-01,2000-05-01)}
for the next range [2000-02-01,2003-07-01) I want the array output:
{[2000-02-01,2000-05-01),[2000-05-01,2002-06-01),[2002-06-01,2003-03-01),[2003-03-01,2003-04-01),[2003-04-01,2003-07-01)}
and for the third range [2002-06-01,2003-03-01) I want the output:
{[2002-06-01,2003-03-01)}
and for the last range [2003-04-01,2004-08-01) the output should be:
{[2003-04-01,2003-07-01),[2003-07-01,2004-08-01)}
As you can see each input range is split up at each boundary where it is overlapped by the other ranges in the set.
I have written a window function which does this.
On
the first call into the window function it calculates all the split up
ranges and stores the resulting arrays into a tuplesort and sort these
back into the original order in the window function.
On subsequent calls into the window function it pulls one array of ranges at a time from the tuplesort and returns that as output.
Then on the last call into the window function I need to release the tuplesort so to not leak resources.
The
problem is that I can't determine which is the last call into the
function so I can't release the tuplesort at the appropriate time.
I need a way to determine when to release the tuplesort, for example a way to register a callback at appropriate time.
Sql wise the table can look like this and say I have the 4 ranges above in the table.
create table t1 (id int, dr daterange);
and the query with my window function: select id, dr, window_range_split(dr) over () from t1
this
query will exhaust the whole window and I can determine when to release
the tuplesort, I release it when I pull out the last tuple in the
window.
However this query won't work: select id, dr, window_range_split(dr) over () from t1 limit 2
the
number of tuples in the window is 4 and I have no idea that there are
only going to be 2 calls into the window function and so I fail release
the tuplesort on the second call.
I should also mention that it's not specifically dateranges I'm targeting. the window function takes anyrange as argument type and returns anyarray.
Best Regards
Dan S