Re: High CPU Usage - PostgreSQL 7.3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Please Cc: the list when replying to things like this so everyone can see (and likely help).

I'm not sure what you're response is actually regarding. Could you give some more detail?

On Wed, 12 Jul 2006, Rizal wrote:

so, i must upgrade my PostgreSQL 803 which i have with a new version ?

----- Original Message -----
From: "Jeff Frost" <jeff@xxxxxxxxxxxxxxxxxxxxxx>
To: "Neil Hepworth" <nhepworth@xxxxxxxxx>
Cc: <pgsql-performance@xxxxxxxxxxxxxx>
Sent: Wednesday, July 12, 2006 10:27 AM
Subject: Re: [PERFORM] High CPU Usage - PostgreSQL 7.3


On Wed, 12 Jul 2006, Neil Hepworth wrote:

I am using version PostgreSQL 7.3.10 (RPM:
postgresql73-rhel21-7.3.10-2).  Unfortunately vacuumdb -a -v does not
give the FSM info at the end (need a newer version of postgres for
that).  Running the same queries on 8.1 reduces the time taken to
about 16 minutes, though I didn't run the test on the same hardware or
OS as I want to keep my test server as close to production as
possible, so I ran the 8.1 server on my Windows laptop (2GHz Centrino
Duo with 2GB of RAM, yes the laptop is brand new :).

Well, looks like you're at least fairly up to date, but there is a fix in
7.3.11 that you might want to get by upgrading to 7.3.15:

      * Fix race condition in transaction log management
        There was a narrow window in which an I/O operation could be
        initiated for the wrong page, leading to an Assert failure or data
        corruption.

It also appears that you can run autovacuum with 7.3 (I thought maybe it
only
went back as far as 7.4).

So, is the 16 minutes on your laptop with 8.1 for windows vs 1hr on the
server
for the whole set of loops?  If so, 4x isn't a bad improvement. :-)  So,
assuming you dumped/loaded the same DB onto your laptop's postgresql
server,
what does the vacuumdb -a -v say on the laptop?  Perhaps we can use it to
see
if your fsm settings are ok.

BTW, did you see Scott's posting here:

http://archives.postgresql.org/pgsql-performance/2006-07/msg00091.php

Since we didn't hear from you for a while, I thought perhaps Scott had hit
on
the fix.  Have you tried that yet?  It certainly would help the planner
out.

You might also want to turn on autovacuum and see if that helps.

What's your disk subsystem like?  In fact, what's the entire DB server
hardware like?


I run through a loop, executing the following or similar queries 8
times (well actually 12 but the last 4 don't do anything) - Jeff I've
attached complete outputs as files.

A debug output further below (numbers after each method call name,
above each SQL statement, are times to run that statement in
milliseconds, the times on the lines "" are cumulative).  So total for
one loop is 515 seconds, multiple by 8 and that gets me to over an
hour); it is actually the deletes that take the most time; 179 seconds
and 185 seconds each loop.

                ----------------------------------------------------

CREATE TABLE fttemp670743219 AS SELECT * FROM ftone LIMIT 0
INSERT INTO fttemp670743219 ( epId, start, direction, classid,
consolidation, cnt )  SELECT epId, TO_TIMESTAMP(start, 'YYYY-MM-DD
HH24:00:00.0')::timestamp AS start, direction, classid, 60 AS
consolidation, SUM(cnt) AS cnt FROM ftone WHERE consolidation = 0 AND
start < TO_TIMESTAMP('2006-07-11 14:04:34.156433+1000', 'YYYY-MM-DD
HH24:00:00.0')::timestamp GROUP BY epId, direction,
TO_TIMESTAMP(start, 'YYYY-MM-DD HH24:00:00.0')::timestamp, classid
DELETE FROM ONLY ftone WHERE ftone.epId = fttemp670743219.epId AND
ftone.direction = fttemp670743219.direction AND ftone.start =
fttemp670743219.start AND ftone.consolidation =
fttemp670743219.consolidation AND ftone.classid =
fttemp670743219.classid
INSERT INTO ftone ( epId, start, consolidation, direction, classid,
cnt ) SELECT epId, start, consolidation, direction, classid, cnt FROM
fttemp670743219
DROP TABLE fttemp670743219
DELETE FROM ftone WHERE consolidation = 0 AND start <
TO_TIMESTAMP((TO_TIMESTAMP('2006-07-11 14:04:34.156433+1000',
'YYYY-MM-DD 00:00:00.0')::timestamp - INTERVAL '10080 MINUTE'),
'YYYY-MM-DD 00:00:00.0')::timestamp

                ----------------------------------------------------

ftone: 0:
createConsolidatedInTemporary: 188:
CREATE TABLE fttemp678233382 AS SELECT * FROM ftone LIMIT 0
createConsolidatedInTemporary: 76783:
INSERT INTO fttemp678233382 ( epPairdefnid, start, direction, classid,
consolidation, cnt )  SELECT epPairdefnid, TO_TIMESTAMP(start,
'YYYY-MM-DD HH24:00:00.0')::timestamp AS start, direction, classid, 60
AS consolidation, SUM(cnt) AS cnt FROM ftone WHERE consolidation = 0
AND start < TO_TIMESTAMP('2006-07-12 11:02:13.865444+1000',
'YYYY-MM-DD HH24:00:00.0')::timestamp GROUP BY epPairdefnid,
direction, TO_TIMESTAMP(start, 'YYYY-MM-DD HH24:00:00.0')::timestamp,
classid
replaceConsolidatedInMainTable: 179178:
DELETE FROM ONLY ftone WHERE ftone.epPairdefnid =
fttemp678233382.epPairdefnid AND ftone.direction =
fttemp678233382.direction AND ftone.start = fttemp678233382.start AND
ftone.consolidation = fttemp678233382.consolidation AND ftone.classid
= fttemp678233382.classid
replaceConsolidatedInMainTable: 61705:
INSERT INTO ftone ( epPairdefnid, start, consolidation, direction,
classid, cnt ) SELECT epPairdefnid, start, consolidation, direction,
classid, cnt FROM fttemp678233382
consolidate: 2656:
DROP TABLE fttemp678233382
MAIN LOOP TOTAL consolidate: 320526
deleteOlderThan: 184616:
DELETE FROM ftone WHERE consolidation = 0 AND start <
TO_TIMESTAMP((TO_TIMESTAMP('2006-07-12 11:02:13.865444+1000',
'YYYY-MM-DD 00:00:00.0')::timestamp - INTERVAL '10080 MINUTE'),
'YYYY-MM-DD 00:00:00.0')::timestamp
MAIN LOOP TOTAL deleteExpiredData: 505142
MAIN LOOP TOTAL generateStatistics: 515611

                ----------------------------------------------------

Thanks again,
  Neil


On 11/07/06, Jeff Frost <jeff@xxxxxxxxxxxxxxxxxxxxxx> wrote:


On Mon, 10 Jul 2006, Neil Hepworth wrote:

I should also explain that I run through these queries on multiple
tables and with some slightly different parameters for the
"consolidation" so I run through those 3 queries (or similar) 9 times
and this takes a total of about 2 hours, with high CPU usage.  And I
am running the queries from a remote Java application (using JDBC),
the client is using postgresql-8.0-311.jdbc3.jar.  The explain
analyse
results I have provided below are from running via pgAdmin, not the
Java app (I did a vacuum analyse of the db before running them):



Neil, did you ever answer which version of 7.3 this is?

BTW, you mentioned that this takes 2 hours, but even looping over this
9
times
seems like it would only take 9 minutes (55 seconds for the SELECT and
4
seconds for the DELETE = 59 seconds times 9).  Perhaps you should post
the
explain analyze for the actual query that takes so long as the planner
output
will likely be quite different.

One thing I noticed is that the planner seems quite incorrect about the
number
of rows it expects in the SELECT.  If you ran vacuum analyze before
this,
perhaps your fsm settings are incorrect?  What does vacuumdb -a -v
output
at
the end?  I'm looking for something that looks like this:

INFO:  free space map: 109 relations, 204 pages stored; 1792 total
pages
needed
DETAIL:  Allocated FSM size: 1000 relations + 20000 pages = 182 kB
shared
memory.

I see your fsm settings are non-default, so it's also possible I'm not
used
to
reading 7.3's explain analyze output. :-)

Also, what does vmstat output look like while the query is running?
Perhaps
you're running into some context switching problems.  It would be
interesting
to know how the query runs on 8.1.x just to know if we're chasing an
optimization that's fixed already in a later version.


Subquery Scan "*SELECT*"  (cost=59690.11..62038.38 rows=23483
width=16) (actual time=16861.73..36473.12 rows=560094 loops=1)
->  Aggregate  (cost=59690.11..62038.38 rows=23483 width=16) (actual
time=16861.72..34243.63 rows=560094 loops=1)
      ->  Group  (cost=59690.11..61451.32 rows=234827 width=16)
(actual time=16861.62..20920.12 rows=709461 loops=1)
            ->  Sort  (cost=59690.11..60277.18 rows=234827 width=16)
(actual time=16861.62..18081.07 rows=709461 loops=1)
                  Sort Key: eppairdefnid, "start"
                  ->  Seq Scan on ftone  (cost=0.00..36446.66
rows=234827 width=16) (actual time=0.45..10320.91 rows=709461
loops=1)
                        Filter: ((consolidation = 60) AND ("start" <
(to_timestamp('2006-07-10 18:43:27.391103+1000'::text,
'YYYY-MM-DDHH24:00:00.0'::text))::timestamp without time zone))
Total runtime: 55378.68 msec

*** For the delete ***:

Hash Join  (cost=0.00..30020.31 rows=425 width=14) (actual
time=3767.47..3767.47 rows=0 loops=1)
Hash Cond: ("outer".eppairdefnid = "inner".eppairdefnid)
->  Seq Scan on ftone  (cost=0.00..23583.33 rows=1286333 width=10)
(actual time=0.04..2299.94 rows=1286333 loops=1)
->  Hash  (cost=0.00..0.00 rows=1 width=4) (actual
time=206.01..206.01 rows=0 loops=1)
      ->  Seq Scan on fttemp1600384653  (cost=0.00..0.00 rows=1
width=4) (actual time=206.00..206.00 rows=0 loops=1)
Total runtime: 3767.52 msec

--
Jeff Frost, Owner       <jeff@xxxxxxxxxxxxxxxxxxxxxx>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908     FAX: 650-649-1954



--
Jeff Frost, Owner <jeff@xxxxxxxxxxxxxxxxxxxxxx>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org





--
Jeff Frost, Owner 	<jeff@xxxxxxxxxxxxxxxxxxxxxx>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux