On Wed, 22 Apr 2009, Glenn Maynard wrote:
On Wed, Apr 22, 2009 at 8:19 AM, Stephen Frost <sfrost@xxxxxxxxxxx> wrote:
Yes, as I beleive was mentioned already, planning time for inserts is
really small. Parsing time for inserts when there's little parsing that
has to happen also isn't all *that* expensive and the same goes for
conversions from textual representations of data to binary.
We're starting to re-hash things, in my view. The low-hanging fruit is
doing multiple things in a single transaction, either by using COPY,
multi-value INSERTs, or just multiple INSERTs in a single transaction.
That's absolutely step one.
This is all well-known, covered information, but perhaps some numbers
will help drive this home. 40000 inserts into a single-column,
unindexed table; with predictable results:
separate inserts, no transaction: 21.21s
separate inserts, same transaction: 1.89s
are these done as seperate round trips?
i.e.
begin <send>
insert <send>
insert <send>
..
end <send>
or as one round trip?
i.e.
begin;insert;insert..;end
40 inserts, 100 rows/insert: 0.18s
one 40000-value insert: 0.16s
40 prepared inserts, 100 rows/insert: 0.15s
are one of these missing a 0?
COPY (text): 0.10s
COPY (binary): 0.10s
Of course, real workloads will change the weights, but this is more or
less the magnitude of difference I always see--batch your inserts into
single statements, and if that's not enough, skip to COPY.
thanks for this information, this is exactly what I was looking for.
can this get stored somewhere for reference?
David Lang
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance