Well, the traditional DBMS way of dealing with this sort of
summarization when the tables involved do not fit into RAM is to
create a "roll up" table or tables for the time period commonly
summarized over.
Since it looks like you've got a table with a row per hour, create
another that has a row per day that summarizes hours 1...24.
Ditto weekly, monthly, quarterly, or any other time period you
frequently summarize over.
Yes, this is explicitly a "space for time" trade-off. DBs are
generally not very well suited to time series data.
I also find it, errr, =interesting= that your dedicated pg server
with 9 GB of RAM "never goes up above 1.8 in total usage".
That simply does not make sense if your OS and pg conf files are
configured correctly.
Make sure that you are running 64b RHEL 4 that is patched /
configured correctly to use the RAM you have.
(with 4 ?multi-core? CPUs, you =are= running a recent 2.6 based kernel, right?)
Ditto checking the pg conf file to make sure the values therein are sane.
With 9 GB of RAM, you should be able to:
=max out shared_buffers at 262143 (2 GB of shared buffers),
=set work_mem and maintenance_work_mem to considerably larger than
the defaults.
(If this query has the box to itself when running, you can set the
memory use parameters to values tuned specifically to the query.)
=just for giggles, boost max_stack_depth from 2 MB -> 4 MB
=set effective_cache_size to a realistic value given your HW + OS +
the tuning above.
The main point here is that most of your RAM should be in use. If
you are getting poor performance and most of the RAM is !not! in use,
Something's Wrong (tm).
Of course, the "holy grail" is to have the entire data set you are
operating over to be RAM resident during the query. If you can
manage that, said query should be =fast=.
RAM is cheap enough that if you can make this query RAM resident by a
reasonable combination of configuration + schema + RAM purchasing,
you should do it.
Cheers,
Ron Peacetree
At 03:07 PM 5/2/2007, Parks, Aaron B. wrote:
Ron:
I'm not sure how the JVM would really affect the issue as it is on a
Windows box connecting remotely. As indicated the PG Server itself has
9 gigs of ram and it never goes up above 1.8 total usage.
If the PG driver is doing something funny (IE waiting to send requests)
that's way out past my ability to fix it, so I will hope that's not it.
You can see the CPU slamming doing the queries, then after a while it
just stops and all I get is tiny little blips on the usage.
AP
-----Original Message-----
From: Ron [mailto:rjpeace@xxxxxxxxxxxxx]
Sent: Wednesday, May 02, 2007 2:55 PM
To: Parks, Aaron B.
Cc: pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: [PERFORM] Intermitent slow queries
Among other possibilities, there's a known problem with slow memory
leaks in various JVM's under circumstances similar to those you are
describing.
The behavior you are describing is typical of this scenario. The
increasing delay is caused by longer and longer JVM garbage
collection runs as java attempts to reclaim enough memory from a
smaller and smaller universe of available memory.
The fastest test, and possible fix, is to go and buy more RAM. See
if 16MB of RAM, heck even 10MB, makes the problem go away or delays
it's onset. If so, there's good circumstantial evidence that you are
being bitten by a slow memory leak; most likely in the JVM.
Cheers,
Ron Peacetree
At 11:24 AM 5/2/2007, Parks, Aaron B. wrote:
>My pg 8.1 install on an AMD-64 box (4 processors) with 9 gigs of ram
>running RHEL4 is acting kind of odd and I thought I would see if
>anybody has any hints.
>
>I have Java program using postgresql-8.1-409.jdbc3.jar to connect
>over the network. In general it works very well. I have run batch
>updates with several thousand records repeatedly that has worked fine.
>
>The Program pulls a summation of the DB and does some processing
>with it. It starts off wonderfully running a query every .5
>seconds. Unfortunately, after a while it will start running queries
>that take 20 to 30 seconds.
>
>Looking at the EXPLAIN for the query no sequential scans are going
>on and everything has an index that points directly at its search
criteria.
>
>Example:
>
>Select sum(whatever) from a inner join b on a.something=b.something
>WHERE b.day=1 and b.hour=1
>Select sum(whatever) from a inner join b on a.something=b.something
>WHERE b.day=1 and b.hour=2
>Select sum(whatever) from a inner join b on a.something=b.something
>WHERE b.day=1 and b.hour=3
>.
>.
>Select sum(whatever) from a inner join b on a.something=b.something
>WHERE b.day=1 and b.hour=23
>Select sum(whatever) from a inner join b on a.something=b.something
>WHERE b.day=1 and b.hour=24
>Select sum(whatever) from a inner join b on a.something=b.something
>WHERE b.day=2 and b.hour=1
>Select sum(whatever) from a inner join b on a.something=b.something
>WHERE b.day=2 and b.hour=2
>.
>.
>.
>
>
>This query runs fine for a while (up to thousands of times). But
>what happens is that it starts to have really nasty pauses when you
>switch the day condition. After the first query with the day it
>runs like a charm for 24 iterations, then slows back down again
>
>My best guess was that an index never finished running, but REINDEX
>on the table (b in this case) didn't seem to help.
>
>Ideas?
>
>AP