Re: PostgreSQL performance issues

Rusty Conover <rconover@xxxxxxxxxxxxx> · Tue, 29 Aug 2006 15:47:17 -0600

On Aug 29, 2006, at 7:52 AM, Willo van der Merwe wrote:

  Hi,

 We're running PostgreSQL 8.1.4 on CentOS 4 (Linux version 2.6.9-34.0.1.ELsmp). Hardware specs:
  2x AMD Dual-Core Opteron 270 Italy 1Ghz HT 2 x 1MB L2 Cache Socket 940
4 GB Registered ECC PC3200 DDR RAM
SuperMicro Server-Class 1U AS1020S series system
Dual-channel Ultra320 SCSI controller
1 x 73 GB 10,000rpm Ultra320 SCSI drive with 8MB cache
 I use it to drive a web application. Everything was working fine when all of a sudden today, things went belly up. Load on the server started increasing and query speeds decreased rapidly. After dropping all the clients I did some quick tests and found the following:

 I have a log table looking like this:
                            Table "public.log"
  Column  |            Type             |            Modifiers
 ---------+-----------------------------+---------------------------------
  site    | bigint                      | not null
  stamp   | timestamp without time zone | default now()
  type    | character(8)                | not null default 'log'::bpchar
  user    | text                        | not null default 'public'::text
  message | text                        |
 Indexes:
     "fki_log_sites" btree (site)
     "ix_log_stamp" btree (stamp)
     "ix_log_type" btree ("type")
     "ix_log_user" btree ("user")
 Foreign-key constraints:
     "log_sites" FOREIGN KEY (site) REFERENCES sites(id) ON UPDATE CASCADE ON DELETE CASCADE

 and it has 743321 rows and a explain analyze select count(*) from property_values;
                                                             QUERY PLAN                                                        
 ----------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=55121.95..55121.96 rows=1 width=0) (actual time=4557.797..4557.798 rows=1 loops=1)
    ->  Seq Scan on property_values  (cost=0.00..51848.56 rows=1309356 width=0) (actual time=0.026..2581.418 rows=1309498 loops=1)
  Total runtime: 4557.978 ms
 (3 rows)

 4 1/2 seconds for a count(*) ? This seems a bit rough - is there anything else I can try to optimize my Database? You can imagine that slightly more complex queries goes out the roof.

 Any help appreciated

 Regards

 Willo van der Merwe

Hi,

What about doing a little bit of normalization? 

With 700k rows you could probably gain some improvements by:

* normalizing the type and user columns to integer keys (dropping the 8 byte overhead for storing the field lengths)
* maybe change the type column so that its a smallint if there is just a small range of possible values (emulating a enum type in other databases) rather the joining to another table.
* maybe move message (if the majority of the rows are big and not null but not big enough to be TOASTed, ergo causing only a small number of rows to fit onto a 8k page) out of this table into a separate table that is joined only when you need the column's content.

Doing these things would fit more rows onto each page, making the scan less intensive by not causing the drive to seek as much.  Of course all of these suggestions depend on your workload.

Cheers,

Rusty
 --
Rusty Conover
InfoGears Inc.