Ron wrote:
For accuracy's sake, which exact config did you finally use?
How did you choose the config you finally used? Did you test the three
options or just pick one?
(Note: I'm not the original poster.)
I just picked the option of putting the data/pg_xlog directory (WAL) on
a 2nd set of spindles. That was the easiest thing for me to change on
this test box.
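The actual move is just the usual stop / move / symlink dance, roughly
like this (the paths below are placeholders, not the real layout on this
box):

# stop the postmaster so pg_xlog is quiescent
/etc/init.d/postgresql stop

# move the WAL directory onto the second pair of spindles
# (/mnt/pgwal stands in for wherever that LVM volume is mounted)
mv /var/lib/postgresql/data/pg_xlog /mnt/pgwal/pg_xlog

# leave a symlink behind so the postmaster still finds it
ln -s /mnt/pgwal/pg_xlog /var/lib/postgresql/data/pg_xlog

/etc/init.d/postgresql start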
The test server is simply a Gentoo box running software RAID and LVM2.
The primary disk set is 2x7200RPM 300GB drives and the secondary disk
set is 2x5400RPM 300GB drives. Brand new install of PGSQL 8.1, with
mostly default settings (I changed FSM pages to be a higher value,
max_fsm_pages = 150000). PGSQL was given its own 32GB ext3 LVM volume
on the primary disk set (2x7200RPM). Originally, all files were on the
primary disk.
The task at hand was inserting a large quantity of ~45-byte rows
(according to "vacuum verbose"), on the order of millions of records per
table. There was a unique key and a unique index. Test clients were
accessing the database via ODBC / ADO and doing the inserts in a fairly
brute-force mode (attempting the insert and calling .CancelUpdate if it
fails).
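The actual clients are ADO, but the loop boils down to roughly the
following (a quick Python/pyodbc sketch, not the real code; the DSN,
table, and column names are just placeholders):

# Rough sketch of what the clients do: fire the INSERT and swallow the
# unique-key violation if the row is already there.
import pyodbc

def generate_rows(n=1000000):
    # stand-in for the client-side code that generates the row data
    for i in range(n):
        yield i, "x" * 37              # narrow (~45-byte) rows

conn = pyodbc.connect("DSN=pgtest")    # placeholder DSN
conn.autocommit = True                 # commit each insert individually
cur = conn.cursor()

for key, payload in generate_rows():
    try:
        cur.execute("INSERT INTO test_rows (k, payload) VALUES (?, ?)",
                    key, payload)
    except pyodbc.IntegrityError:
        pass                           # duplicate key -- the .CancelUpdate case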
When the tables were under 2 million rows, performance was okay. At one
point, I had a 1.8GHz P4, dual Opteron 246, and Opteron 148 CPUs running
at nearly 100% CPU, processing data and doing inserts into the database.
So I had 4 clients running, performing inserts into 4 separate tables in
the same database. The P4 ran at about half the throughput of the
Opterons (client-bound due to the code that generated row data prior to
the insert), so I'd score my throughput as roughly 3.3-3.4, where 1.0 is
full utilization of the Opteron 148 box (so the dual 246 counts as
roughly two and the P4 as roughly half).
However, once the tables started getting above ~2 million rows,
performance took a nosedive. CPU utilization on the 4 client CPUs
dropped into the basement (5-20% CPU) and I had to back off on the
number of clients, so throughput dropped to around 0.25. The Linux box
was spending nearly all of its time waiting on the primary disks.
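For anyone wanting to watch the same thing, something along these lines
works (iostat comes from the sysstat package):

# "wa" column = % of time the CPUs sit waiting on I/O, sampled every 5s
vmstat 5

# per-device extended stats, to see which spindles are saturated
iostat -x 5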
Moving the data/pg_xlog (WAL) to the 2nd set of disks (2x5400RPM) in the
test server made a dramatic difference for this mass insert. I'm
running the P4 (100% CPU) and the Opteron 148 (~80% CPU) at the moment.
While it's not up to full speed, a throughput of ~1.3 is a lot better
than the ~0.25 that I was getting prior. (The two tables currently
being written to have over 5 million rows each. One table has ~16 million
rows.) Wait percentage in "top" is only running 20-30% (dipping as low
as 10%). I haven't pushed this new setup hard enough to determine where
the upper limit for throughput is.
It's very much a niche test (millions of inserts of narrow rows into
multiple tables using fairly brain-dead code). But it gives me data
points on which to base purchasing of the production box. The original
plan was a simple RAID1 setup (2 spindles), but this tells me it's
better to order 4 spindles and set it up as a pair of RAID1 sets.
Whether 4 spindles are better as two separate RAID1 arrays or configured
as a single RAID1+0 array... dunno. Our application is typically more
limited by insert speed than read speed (so I'm leaning towards separate
RAID arrays).
I'm sure there's also more tuning that could be done to the PGSQL
database (in the configuration file). Also, the code is throwaway and
isn't the most elegant.
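The obvious candidates for an insert-heavy load like this would be
something along these lines (untested guesses for this box, not what's
running now):

# postgresql.conf (8.1) -- candidate settings for an insert-heavy load.
# Untested guesses, not what this test box is currently running.
shared_buffers = 20000          # 8KB pages, ~160MB; the stock default is tiny
checkpoint_segments = 16        # default is 3; fewer, larger checkpoints
wal_buffers = 64                # default is 8; more headroom for WAL traffic
maintenance_work_mem = 262144   # in KB (256MB); helps index builds and vacuum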