On Tue, 2007-09-18 at 09:55 -0500, Scott Marlowe wrote:
> On 9/17/07, Ow Mun Heng <Ow.Mun.Heng@xxxxxxx> wrote:
>
> > Just wondering how everyone is doing aggregation of production data.
>
> Where I work, we aggregate by the minute in the application, then dump
> those stats entries into the database en masse from each machine. So,

[snip]

> The fail1/2/3 are types of failures, the totalresptime time is all the
> successful requests added together, so that totalresptime/success =
> average for that minute, and max is the longest running request ending
> in that minute.

So it's basically a summation rather than using avg() all the way. That
would mean using something like bigint to hold the large running totals.

> > (getting min/max/count isn't much of an issue. Stddev is the main
> > issue, I believe)
> >
> > One such instance I've read about is..
>
> Isn't stddev() working for you?

Stddev is computed over a population of data. If I only keep per-slice
stddevs, I can't combine them across slices; the stddev over, say, 2
months (8 weeks) would be wrong. (I've sketched what I mean at the end
of this mail.)

> What I do is aggregate the minute time slices by grouping by
> date_trunc('xxx', timestamp) and then use that as a subselect to a
> query that does the stddev() outside of that. Works pretty well for
> us, and makes it easy to identify trends.

Trend analysis is very critical here, and the only reference I've found
on how to get a "stddev" across slices is the one I posted.

> One of the things we started doing is to aggregate the last six weeks
> data by the day / hour and then comparing the last 24 hours worth of
> data to those six weeks worth to see trends for each hour of each day.
> The queries are huge and ugly, but they work, and run in about 4
> minutes on a fairly good sized chunk of data. We have about 150k to
> 300k entries a day put into this db.

I'm not sure how many K entries we get in a day (I've yet to count it),
but I'm getting into trouble with one-to-many relationships, where PG
chooses to do nested loops etc. (lots of left joins on the same tables).
So I've had to resort to SRFs and function scans. A complex query that
took >2 hours (on my laptop's PG; it runs in 20 min on high-end MSSQL)
is reduced to ~60 secs.

Thanks for the response. BTW, are user queries an issue? Do they
interfere with the loading? (That's one of my concerns over here.)

I noticed that you do the calculation in the app before sending it to
the DB, which is a good thing, and every minute too. (Is it advisable
to do it in chunks of 1 min? Less data per minute versus per 30 min /
1 hour.)
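P.S. To make the summation idea concrete, here's roughly the per-minute
table I understand you to be keeping (a minimal sketch, untested; the
table and column names are my invention). Counters are bigint so the
running sums don't overflow:

  CREATE TABLE stats_minute (
      bucket        timestamptz NOT NULL, -- start of the minute
      host          text        NOT NULL,
      success       bigint      NOT NULL, -- successful requests that minute
      fail1         bigint      NOT NULL, -- the three failure types
      fail2         bigint      NOT NULL,
      fail3         bigint      NOT NULL,
      totalresptime bigint      NOT NULL, -- response times summed, not averaged
      maxresptime   integer     NOT NULL, -- longest request ending that minute
      PRIMARY KEY (bucket, host)
  );

  -- The per-minute average is derived, never stored:
  SELECT bucket,
         totalresptime::numeric / NULLIF(success, 0) AS avg_resptime
    FROM stats_minute
   WHERE bucket >= now() - interval '1 hour';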
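And this is what I meant about stddev: per-slice stddevs can't be
combined, but if each minute also carries the sum of the squared
response times (a hypothetical resptime_sumsq column), the exact sample
stddev over any window falls out of n, sum(x) and sum(x^2):

  -- sample stddev = sqrt( (n*sum(x^2) - sum(x)^2) / (n*(n-1)) )
  SELECT sqrt( (sum(success) * sum(resptime_sumsq)
                - sum(totalresptime) ^ 2)
               / NULLIF(sum(success) * (sum(success) - 1), 0)
             ) AS stddev_8wk
    FROM stats_minute
   WHERE bucket >= now() - interval '8 weeks';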
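Scott, just to check I've understood your date_trunc() approach, is it
something along these lines (my guess, against the same made-up table)?
The minute slices get rolled up per hour in a subselect, and stddev()
is taken outside of that:

  SELECT date_trunc('day', hour_bucket) AS day,
         avg(hourly_avg)    AS avg_of_hourly_avgs,
         stddev(hourly_avg) AS stddev_of_hourly_avgs
    FROM (SELECT date_trunc('hour', bucket) AS hour_bucket,
                 sum(totalresptime)::numeric
                   / NULLIF(sum(success), 0) AS hourly_avg
            FROM stats_minute
           GROUP BY 1) AS hourly
   GROUP BY 1
   ORDER BY 1;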