It will be very important to determine whether, as performance degrades, you
are I/O bound, CPU bound, or hindered by some other contention
(db locks, context switching, etc).
Try turning on statement duration logging, either for all statements or just
for "slow" statements (those over 100 ms, or some other arbitrary threshold).
Either eyeball the log or write a script to see which statement(s) are most
frequently the slowest. This can greatly aid tuning.
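A rough sketch of what I mean, in postgresql.conf (the 100 ms threshold is
just an illustration, pick whatever makes sense for your workload):

    # log every statement that runs longer than 100 ms
    # (0 would log everything, -1 disables it)
    log_min_duration_statement = 100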
You say the db is write intensive. In what way -- inserts or updates?
The former tend to be much cheaper than the latter. If the latter, are
things being adequately vacuumed? Space lost to dead tuples can really
hurt performance. If you have lots of concurrent writes, commit_delay/
commit_siblings can help, as can increasing checkpoint_segments
further. I see you have fsync off -- are you feeling lucky? ;^)
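For reference, the settings I'm thinking of (the numbers below are only
placeholders to experiment with, not recommendations):

    # delay each commit slightly so concurrent transactions can share one WAL flush
    commit_delay = 10000      # microseconds
    commit_siblings = 5       # only delay if at least this many xacts are open
    # push checkpoints further apart if they are coming too often
    checkpoint_segments = 64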
If you are I/O bound, see what the disks are doing. How fast are they
reading/writing? How close are they to their max throughput?
Typically I find the disks are nowhere near it, due to excessive
seeking. If that's the case you can usually only fix it by putting
more of the DB in RAM -- buy more RAM and crank up shared_buffers
(I would say to at least double what you have now, maybe more, and
much more with 8.1) -- or by arranging the data better on disk
(clustering, denormalizing data, putting tables and indices on
different disks, etc).
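For example (the numbers and the table/index names below are made up, just
to show the shape of it):

    # postgresql.conf: 16384 x 8 kB pages = 128 MB, double the current 8192
    shared_buffers = 16384

    -- and/or physically order a hot table along its most-used index
    -- (8.0/8.1 CLUSTER syntax; names are hypothetical)
    CLUSTER orders_customer_idx ON orders;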
-Casey
On Aug 31, 2006, at 8:45 AM, Cosimo Streppone wrote:
Good morning,
I'd like to ask you some advice on pg tuning in a high
concurrency OLTP-like environment.
The application I'm talking about is running on Pg 8.0.1.
Under average user load, iostat and vmstat show that iowait stays
well under 1%. Table and index scan and seek times are also good.
I can be reasonably sure that disk I/O is not the *main* bottleneck
here.
These OLTP transactions are each composed of 50-1000+ small queries,
on single tables or 2-3 joined tables. Write operations are very
frequent, and are done concurrently by many users on the same data.
Often there are also queries which involve record lookups like:
SELECT DISTINCT rowid2 FROM table
WHERE rowid1 IN (<long_list_of_numerical_ids>) OR
refrowid1 IN (<long_list_of_numerical_ids>)
These tables are structured with rowid fields which link to
other external tables, and the links are fairly complex to follow.
SQL queries and indexes have been carefully(?) built and tested,
each with its own "explain analyze".
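For example, on the lookup above (names as in the sketch):

    EXPLAIN ANALYZE
    SELECT DISTINCT rowid2 FROM table
     WHERE rowid1 IN (<long_list_of_numerical_ids>) OR
           refrowid1 IN (<long_list_of_numerical_ids>);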
The problem is that under peak load, when the number of concurrent
transactions rises, there is a noticeable performance degradation.
I'm looking for tuning ideas/tests. I plan to concentrate,
in priority order, on:
- postgresql.conf, especially:
effective_cache_size (now 5000)
bgwriter_delay (500)
commit_delay/commit_siblings (default)
- start to use tablespaces for most intensive tables
- analyze the locks situation while queries run (see the query sketch
  after this list)
- upgrade to 8.1.n
- convert db partition filesystem to ext2/xfs?
(now ext3+noatime+data=writeback)
- ???
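For the locks item above, what I have in mind is something along these
lines (just a starting point; columns are those of the standard pg_locks
view):

    -- show locks that are currently being waited on
    SELECT pid, relation::regclass AS relation, mode, granted
      FROM pg_locks
     WHERE NOT granted;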
Server specs:
2 x P4 Xeon 2.8 GHz
4 GB RAM
LSI Logic SCSI 2x U320 controller
6 disks in raid 1 for os, /var, WAL
14 disks in raid 10 for db on FC connected storage
Current config is now (the rest is like the default):
max_connections = 100
shared_buffers = 8192
work_mem = 8192
maintenance_work_mem = 262144
max_fsm_pages = 200000
max_fsm_relations = 1000
bgwriter_delay = 500
fsync = false
wal_buffers = 256
checkpoint_segments = 32
effective_cache_size = 5000
random_page_cost = 2
Thanks for your ideas...
--
Cosimo