Re: Summing activity intervals without any obvious column to group by

Carey Tilden <carey.tilden@xxxxxxxxx> · Mon, 13 Aug 2012 18:01:05 -0700

On Mon, Aug 13, 2012 at 5:53 PM, Andrew Sullivan <ajs@xxxxxxxxxxxxxxx> wrote:

On Mon, Aug 13, 2012 at 05:28:24PM -0700, Carey Tilden wrote:

> how to solve it.  I have a list of program start/stop times, and I want to

> know how long each run takes to complete.  The thing that's really tripping

> me up is there are gaps in the sequence.  I've figured out how to collapse

> the results down to a single row per attempt, but I can't quite figure out

> how to further collapse down each full run to its own row.  It'd be easy if

> I had a session_id or something to group on, but I don't.  All I have are

> the start/stop times.

Looking at your data and the sample output you provided, it would

appear that the "real" start time of a program is always the one that

is started _after_ a completion (or else it's the very first start).

In other words, it's never the case that a "start" in the program_runs data

is the start of a new run when an existing, previously-started run

hasn't completed.

Is that right?  If so, then you ought to be able to use windowing

functions.  For each completion, pick the earliest start before it that is

_after_ a completion or, if there is no such completion, is the very

first record.  See the discussion of windowing functions in the manual

for how to do this.

That's right, but I'm not simply trying to see how long it's been between the "real" start and the "real" end.  I want to exclude the gaps between runs.  I'm looking for how many minutes it spent actually processing.

Carey