Re: Variable (degrading) perfomance

"Merlin Moncure" <mmoncure@xxxxxxxxx> · Fri, 15 Jun 2007 15:39:22 -0400

On 6/11/07, Vladimir Stankovic <V.Stankovic@xxxxxxxxxx> wrote:
Hi all,

It seems that I have an issue with the performance of a PostgreSQL server.

I'm running write-intensive, TPC-C like tests. The workload consist of
150 to 200 thousand transactions. The performance varies dramatically,
between 5 and more than 9 hours (I don't have the exact figure for the
longest experiment). Initially the server is relatively fast. It
finishes the first batch of 50k transactions in an hour. This is
probably due to the fact that the database is RAM-resident during this
interval. As soon as the database grows bigger than the RAM the
performance, not surprisingly, degrades, because of the slow disks.
My problem is that the performance is rather variable, and to me
non-deterministic. A 150k test can finish in approx. 3h30mins but
conversely  it can take more than 5h to complete.
Preferably I would like to see *steady-state* performance (where my
interpretation of the steady-state is that the average
throughput/response time does not change over time). Is the steady-state
achievable despite the MVCC and the inherent non-determinism between
experiments? What could be the reasons for the variable performance?
- misconfiguration of the PG parameters (e.g. autovacuum does not cope
with the dead tuples on the MVCC architecture)
- file fragmentation
- index bloat
- ???

vmstat is telling you that the server is i/o bound.  an iostat will
tell be helpful to tell you where things are binding up...either the
data volume, wal volume, or both.  I suspect your sorts are spilling
to disk which is likely the cause of the variable performance,
interacting with autovacuum.  Another possibility is vacuum is bogging
you down.  look for pg_tmp folders inside the database tree to see if
this is happening.   Also you want to see if your server is swapping.

first, I'd suggest bumping maintenance_work_mem to 256mb.   I'd also
suggest bumping work_mem higher, but you are going to have to
calculate how far to go based on how many active queries with sort are
going to fire simultaneously.  It can be a fine line because your a
bit underpowered memory but your database is small as well.  bumping
work_mem but throwing your server into swap solves nothing.

merlin