Re: intermittant performance problem

Mike Charnoky <noky@xxxxxxxxxxx> · Mon, 09 Mar 2009 22:21:25 -0400

The random sampling query is normally pretty snappy.  It usually takes 
on the order of 1 second to sample a few thousand rows of data out of a 
few million.  The sampling is consistently quick, too.  However, on some 
days, the sampling starts off quick, then when the process starts 
sampling from a different subset of data (different range of times for 
the same day), the sampling query takes a couple minutes.

Regarding the concurrent vacuuming, this is definitely not happening.  I 
always check pg_stat_activity whenever the sampling process starts to 
lag behind.  I have never seen a vacuum running during this time.

Interesting idea to issue the EXPLAIN first... I will see if I can 
instrument the sampling program to do this.

Thanks for your help Tom.

Mike

Tom Lane wrote:
Mike Charnoky <noky@xxxxxxxxxxx> writes:
The sampling query which runs really slow on some days looks something
like this:

INSERT INTO sampled_data
   (item_name, timestmp, ... )
   SELECT item_name, timestmp, ... )
   FROM raw_data
   WHERE timestmp >= ? and timestmp < ?
   AND item_name=?
   AND some_data_field NOTNULL
   ORDER BY random()
   LIMIT ?;

Hmph, I'd expect that that would run pretty slowly *all* the time :-(.
There's no good way to optimize "ORDER BY random()".  However, it seems
like the first thing you should do is modify the program so that it
issues an EXPLAIN for that right before actually doing the query, and
then you could see if the plan is different on the slow days.

We have done a great deal of PG tuning, including the autovacuum for the
"raw_data" table.  Autovacuum kicks like clockwork every day on that
table after the sampling process finishes (after one day's worth of data
is deleted from "raw_data" table, a roughly 7% change in size).

Also, are you sure you have ruled out the possibility that the problem
comes from autovac kicking in *while* the update is running?

			regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general