Re: Given a set of daterange, finding the continuous range that includes a particular date

Adrian Klaver <adrian.klaver@xxxxxxxxxxx> · Thu, 22 Feb 2018 21:05:41 -0800

On 02/22/2018 05:23 PM, Ken Tanzer wrote:

On Thu, Feb 22, 2018 at 5:05 PM, Adrian Klaver 
<adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>> wrote:

    On 02/22/2018 04:58 PM, Ken Tanzer wrote:

        On Thu, Feb 22, 2018 at 4:53 PM, Adrian Klaver
        <adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>
        <mailto:adrian.klaver@xxxxxxxxxxx
        <mailto:adrian.klaver@xxxxxxxxxxx>>> wrote:

             On 02/22/2018 04:44 PM, Ken Tanzer wrote:

                 Hi, hoping to get some help with this.  I'm needing to
        take a
                 specific date, a series of dateranges  and, given a
        specific
                 date, return a single conitinuous daterange that
        includes that date.

                 To elaborate a bit, I've got lots of tables that
        include start
                 and end dates.  For example:

                 CREATE TABLE tbl_staff_assign (
                       staff_assign_id         SERIAL PRIMARY KEY,
                       client_id               INTEGER NOT NULL REFERENCES
                 tbl_client (client_id),
                       staff_id                INTEGER REFERENCES
                 tbl_staff(staff_id),
                       staff_assign_type_code      VARCHAR(10) NOT NULL
                 REFERENCES tbl_l_staff_assign_type
        (staff_assign_type_code),
                       staff_assign_date           DATE NOT NULL,
                       staff_assign_date_end       DATE,
                 ...

                 So a client might leave a progrma and then return
        later, or they
                 might simply switch to another staff_id.  (In which
        case one
                 record will have and end date, and the next record will
        start on
                 the next day.)  In this case I need to know "what
        period were
                 they continuously in the program that includes X date?"
          So I'd
                 like to be able to do something like:

                 "SELECT staff_assign_date,continuous_daterange(
                 staff_assign_date, (SELECT

        array_agg(daterange(staff_assign_date,staff_assign_date_end,'[]')
                 ) FROM staff_assign sa2 WHERE
        sa2.client_id=sa.client_id) FROM
                 staff_assign sa

                 I've done this before with procedures specific to a
        particular
                 table, and working with the start and end dates.  I'm now
                 wanting to try to do this once generically that will
        work for
                 all my cases.  So I'm hoping to do this in a way that
                 performance isn't horrible.  And it's a little unclear
        to me how
                 much and how I might be able to use the daterange
        operators to
                 accomplish this efficiently.

             The operator I use to solve similar problems:

        https://www.postgresql.org/docs/10/static/functions-range.html
        <https://www.postgresql.org/docs/10/static/functions-range.html>

        <https://www.postgresql.org/docs/10/static/functions-range.html
        <https://www.postgresql.org/docs/10/static/functions-range.html>>

             @>      contains element       
        '[2011-01-01,2011-03-01)'::tsrange
             @> '2011-01-10'::timestamp   t

        Thanks Adrian.  But how would you apply that to this situation,
        where I have a series of (quite possibly discontinuous) dateranges?

    This is going to depend on a more formal definition of the problem
    with some sample data. Right now I am trying to reconcile "what
    period were they continuously in the program that includes X date?"
    with "where I have a series of (quite possibly discontinuous)
    dateranges? " Maybe its just me,  I can't see how discontinuous can
    also be continuously.

*Start   End*
-- Episode 1
1/1/16  3/30/16
4/1/16  4/30/16
-- Episode 2
1/1/18  1/31/18
2/1/18  NULL

Given 1/15/18, should return 1/1/18-->NULL
Given 1/15/16, should return 1/1/16-->4/30/16

Just thinking out loud here, in a function:

1) For a client_id you can find the min(staff_assign_date).

2) You can create a max(staff_assign_date_end) by using 
COALESCE(staff_assign_date_end, current_date)

3) You now have the outer range for the episodes.

4) In that range of dates for each staff_assign_date you find the 
staff_assign_date_end that greater then the staff_assign_date but less 
then the next staff_assign_date. That will give you your episodes.

5) You can then test to see if the X date is in the calculated episodes.

6) As a shortcut you can find min(staff_assign_date) that is less then 
the X date and see if the staff_assign_date_end is greater then the X date.

Thinking about this a bit more, I'm wondering if a window function could 
be used for this?  I've used them a bit, but definitely never wrote one. 
   Something like

continuous_daterange(staff_assign_date,daterange(staff_assign_date,staff_assign_date_end,'[[]') 
OVER (PARTION BY client_id)

But then a window function can only see the rows included in the query, 
yes?  In which case this would only work if you queried for all the rows 
for a particular client_id?  I guess in the example I gave I was doing 
that anyway, so maybe this would be no worse.

Ken
--
AGENCY Software
A Free Software data system
By and for non-profits
/http://agency-software.org//
/https://demo.agency-software.org/client/
ken.tanzer@xxxxxxxxxxxxxxxxxxx <mailto:ken.tanzer@xxxxxxxxxxxxxxxxxxx>
(253) 245-3801

Subscribe to the mailing list 
<mailto:agency-general-request@xxxxxxxxxxxxxxxxxxxxx?body=subscribe> to
learn more about AGENCY or
follow the discussion.

--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx