Re: Group by range in hour of day

John W Higgins <wishdev@xxxxxxxxx> · Mon, 16 Mar 2015 16:02:33 -0700

Assuming 3 things
Table name - test
Column names - start_time, end_time
Added an id column (int) to distinguish each record in the table

You can go with this..... (my apologies for formatting issues)

with
slots as (
    select  *
    from    generate_series(0,1439) as s(slot)
),
slots_hours as (
    select  slot,
            slot / 60 as hour
    from slots
),
minutes as (
    select  id,
            date_part('hour', start_time) * 60 + date_part('minute', start_time) as start_minute,
            date_part('hour', end_time) * 60 + date_part('minute', end_time) as end_minute
    from    test
),
minute_slots as (
    select  id,
            slot,
            hour
    from    minutes
    join    slots_hours
        on  minutes.start_minute <= slots_hours.slot
        and minutes.end_minute > slots_hours.slot
)
select      hour,
            count(*) / 60.0 as sum,
            count(distinct id) as count
from        minute_slots
group by    hour

I'm certain there are more elegant solutions possible - but you can grasp each step this way.

John

On Mon, Mar 16, 2015 at 2:57 PM, Israel Brewster <israel@xxxxxxxxxxxxxx> wrote:
I have a table with two timestamp columns for the start time and end time of each record (call them start and end).I'm trying to figure out if there is a way to group these records by "hour of day", that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start,end]. Obviously each record may well fall into multiple groups under this scenario.
The goal here is to figure out, for each hour of the day, a) what is the total number of "active" records for that hour, and b) what is the total "active" time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour.

So, for simplified example, if the table contained three records:

         start              |               end
-----------------------------------------------------
2015-03-15 08:15  |  2015-03-15 10:45
2015-03-15 09:30  |  2015-03-15 10:15
2015-03-15 10:30  |  2015-03-15 11:30

Then the results should break out something like this:

hour  |  count  |  sum
-----------------------------
8       |    1       |   0.75
9       |    2       |   1.5
10     |    3       |   1.5
11     |    1       |   0.5

I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in postgress? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks.

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------