Re: Select ranges based on sequential breaks

Mike Toews <mwtoews@xxxxxx> · Mon, 22 Jun 2009 15:05:14 -0700

Scott Marlowe wrote:
On Mon, Jun 15, 2009 at 12:23 PM, Mike Toews<mwtoews@xxxxxx> wrote:

This is easy to compute using a spreadsheet or in R, but how would I do this
with SQL? I'm using 8.3. Advice is appreciated.

FYI (and I'm no expert in this area) R is available as a pl for
postgres, look for pl/R or plR

FYI, here is how I implement ranges on sequential breaks in R. Sorry, I 
haven't meddled with plR yet, although I'm experience with both R and 
postgres. This is all R code:

# Randomly sampled bins: "red", "blue"
dat <- data.frame(date=seq(as.Date("2009-01-01"), by="days", length.out=20))
dat$bin <- factor(sample(c("red","blue"), 10, replace=TRUE, 
prob=c(0.4,0.6)))

# Determine where the rows are different; 1=different rows, 0=same rows
dat$breaks <- ifelse(dat$bin != c(TRUE, 
as.character(dat$bin[-nrow(dat)])), 1, 0)

# Determine where the continuous parts are:
dat$part <- factor(cumsum(dat$breaks))

# Results vary due to random sampling
print(dat)

... and on the SQL side, simple aggregates like min(), max(etc) can be 
used with "GROUP BY part" to determine the start/end dates, length of 
duration, etc.

-Mike

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general