Brian,

Any idea what your bottleneck is? You can find out at a crude level by
attaching strace to the running backend, assuming it's running long enough
to grab it, and looking at the system call breakdown. Basically, run one of
your long insert streams, do a "top" to find which process id the backend
is using (the <pid>), then run this:

  strace -p <pid> -c

and Ctrl-C after a few seconds to see a breakdown of system calls.

I think what you'll see is that for the small numbers of inserts per
transaction you'll be bottlenecked on fsync() calls, or fdatasync() if you
left the default. Things might speed up a whole lot there depending on your
choice of one or the other.

- Luke

On 6/19/06 5:09 PM, "Brian Hurt" <bhurt@xxxxxxxxxxxxxxxxx> wrote:

> For long involved reasons I'm hanging out late at work today, and rather
> than doing real, productive work, I thought I'd run some benchmarks
> against our development PostgreSQL database server. My conclusions are
> at the end.
>
> The purpose of the benchmarking was not to find out how fast Postgres is,
> nor to compare Postgres to other databases, but instead to answer the
> question: when does it become worthwhile to switch over to using COPY
> instead of INSERTs, and by how much? This benchmark should in no way be
> used to gauge the absolute performance of PostgreSQL.
>
> The machine in question: a new HP-145 rack-mount server with a
> single-socket dual-core 1.8GHz Opteron 275, 1MB of cache per core, and
> 4GB of memory, running Red Hat Linux (I forget which version). The
> database was on the local single SATA hard disk, no RAID. From the
> numbers, I'm assuming the disk honors fsync. Some tuning of the database
> was done; specifically, shared_buffers was upped to 2500 and temp_buffers
> to 1500 (mental note to self: must increase these significantly more,
> forgot they were so low). fsync is definitely on. The test program was
> written in OCaml, compiled to native code, using the OCaml PostgreSQL
> connection library (OCaml bindings of libpq). The test was single
> threaded, only one insert going on at a time, run over the local gigabit
> ethernet network from a remote machine.
>
> The table design was very simple:
>
>   CREATE TABLE copytest (
>       id SERIAL PRIMARY KEY NOT NULL,
>       name VARCHAR(64),
>       thread INT,
>       block INT,
>       num INT);
>
> The id column was not specified in either the inserts or the copies;
> instead it just came from the sequence. Other than the id, there are no
> indexes on the table. Numbers are approximate.
>
> Results:
>
>   Inserts, 1 per transaction*           83 inserts/second
>   Inserts, 5 per transaction           419 inserts/second
>   Inserts, 10 per transaction          843 inserts/second
>   Inserts, 50 per transaction       ~3,100 inserts/second
>   Inserts, 100 per transaction      ~4,200 inserts/second
>   Inserts, 1,000 per transaction    ~5,400 inserts/second
>   Copy, 5 element blocks              ~405 inserts/second
>   Copy, 10 element blocks             ~700 inserts/second
>   Copy, 50 element blocks           ~3,400 inserts/second
>   Copy, 100 element blocks          ~6,000 inserts/second
>   Copy, 1,000 element blocks       ~20,000 inserts/second
>   Copy, 10,000 element blocks      ~27,500 inserts/second
>   Copy, 100,000 element blocks     ~27,600 inserts/second
>
> * The singleton inserts were not done in an explicit begin/end block,
> but were instead "unadorned" inserts.
>
> Some conclusions:
>
> 1) Transaction time is a huge hit at the small block sizes. Going from
> 1 insert per transaction to 10 inserts per transaction gives a 10x
> speedup.
> Once the block size gets large enough (tens to hundreds of elements per
> block), the cost of a transaction becomes less of a problem.
>
> 2) Both insert methods hit fairly hard walls of diminishing returns,
> where larger block sizes gave little, and eventually no, performance
> advantage.
>
> 3) For small enough block sizes, inserts are actually faster than
> copies, but not by much. There is a broad plateau, spanning at least
> 5 through 100 elements per block (more than an order of magnitude),
> where the performance of the two is roughly identical. For the general
> case, I'd be inclined to switch to copies sooner (at 5 or so elements
> per block) rather than later.
>
> 4) At the high end, copies vastly outperformed inserts. At 1,000
> elements per block, the copy was almost 4x faster than inserts. This
> widened to ~5x before copy started topping out.
>
> 5) The performance of Postgres, at least on inserts, depends critically
> on how you program it. On the same hardware, performance for me varied
> by a factor of more than 300, about 2.5 orders of magnitude. Programs
> which are unaware of transactions and are designed to be highly portable
> are likely to hit the abysmal side of that range, where the transaction
> overhead kills performance. I'm not sure there is a fix for this (let
> alone an easy fix); simply dropping transactions is obviously not it.
> Programs that are transaction-aware and willing to use
> PostgreSQL-specific features can get surprisingly excellent performance.
> Simply being transaction-aware and doing multiple inserts per
> transaction greatly increases performance, giving an easy order of
> magnitude increase (wrapping 10 inserts in a transaction gives a 10x
> performance boost).
>
> Brian
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
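
For anyone wanting to try the same comparison, here is a minimal OCaml
sketch of the two loading strategies in the results table above (multi-row
INSERTs inside one transaction versus COPY ... FROM STDIN), written against
the postgresql-ocaml bindings Brian mentions. The connection string, row
contents, and block size of 100 are illustrative assumptions, not his
actual test harness:

  (* Rough sketch only: assumes the postgresql-ocaml bindings, where the
     connection object exposes #exec, #putline, #endcopy and #finish.
     Connection info and row contents are made up for illustration. *)

  open Printf

  let conninfo = "host=localhost dbname=test"   (* assumed; adjust as needed *)

  (* Strategy 1: n ordinary INSERTs wrapped in a single transaction. *)
  let insert_block (conn : Postgresql.connection) n =
    ignore (conn#exec "BEGIN");
    for i = 1 to n do
      ignore (conn#exec
        (sprintf "INSERT INTO copytest (name, thread, block, num) \
                  VALUES ('bench', 1, 1, %d)" i))
    done;
    ignore (conn#exec "COMMIT")

  (* Strategy 2: the same n rows pushed through COPY ... FROM STDIN. *)
  let copy_block (conn : Postgresql.connection) n =
    ignore (conn#exec "COPY copytest (name, thread, block, num) FROM STDIN");
    for i = 1 to n do
      conn#putline (sprintf "bench\t1\t1\t%d\n" i)
    done;
    conn#putline "\\.\n";   (* old-style end-of-data marker *)
    conn#endcopy

  let () =
    let conn = new Postgresql.connection ~conninfo () in
    insert_block conn 100;
    copy_block conn 100;
    conn#finish

Both paths pay the commit fsync once per block rather than once per row,
which matches Luke's point about fsync dominating the single-row case; at
large block sizes COPY pulls further ahead mainly because it also avoids
per-statement parsing and network round trips.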