On 2/6/07, Scott Marlowe <smarlowe@xxxxxxxxxxxxxxxxx> wrote:
> On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
> > I have a pl/pgsql function that is inserting 200,000 records for
> > testing purposes.  What is the expected time frame for this operation
> > on a pc with 1/2 a gig of ram and a 7200 RPM disk?  The processor is
> > a 2ghz cpu.  So far I've been sitting here for about 2 million ms
> > waiting for it to complete, and I'm not sure how many inserts postgres
> > is doing per second.
>
> That really depends.  Doing 200,000 inserts as individual transactions
> will be fairly slow.  Since PostgreSQL generally runs in autocommit
> mode, this means that if you didn't expressly begin a transaction, you
> are in fact inserting each row as a transaction.  i.e. this:
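In other words, the difference is roughly this (table name made up, just
to show autocommit vs. an explicit transaction):

    insert into test_table (id) values (1);
    insert into test_table (id) values (2);
    -- ...each statement is its own commit under autocommit

versus:

    begin;
    insert into test_table (id) values (1);
    insert into test_table (id) values (2);
    -- ...
    commit;   -- one commit for the whole batch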
I think the OP is doing the insertion inside a pl/pgsql loop, so a
transaction is implied here.  For creating test data, generate_series or
insert...select is obviously the way to go.  If that's unsuitable for
some reason, I would suggest a RAISE NOTICE every n records so you can
monitor progress and make sure nothing is binding up in a lock or
something like that (rough sketch below).  Be especially wary of
performance that degrades as the run goes on.

Another common cause of poor insert performance is an RI check against
an un-indexed column.  In-transaction insert performance should be
between 1k and 10k records/second in normal situations, meaning that if
you haven't inserted 1 million records inside of an hour, something else
is going on.

Generally, insertion performance from fastest to slowest is:

* insert ... select generate_series(...)
* insert ... select
* copy
* insert (),(),()[...]  (at least 10, preferably 100, rows per statement)
* begin, prepare, n prepared inserts executed, commit
* begin, n inserts, commit
* plpgsql loop, single inserts
* n inserts outside of a transaction

The exact ordering might not be set in stone (copy might beat
insert...select, for example), but the top 4 methods will always be much
faster than the bottom 4.

merlin
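p.s. a rough sketch of the generate_series and RAISE NOTICE approaches I
mentioned (table and column names are invented, adjust to taste):

    -- fastest: build the test data in a single statement
    insert into test_table (id, val)
    select g, 'row ' || g
    from generate_series(1, 200000) as g;

    -- if you really need the plpgsql loop, log progress every 10000 rows
    create or replace function load_test_rows(n int) returns void as $$
    declare
      i int;
    begin
      for i in 1..n loop
        insert into test_table (id, val) values (i, 'row ' || i);
        if i % 10000 = 0 then
          raise notice 'inserted % rows', i;
        end if;
      end loop;
    end;
    $$ language plpgsql;

    select load_test_rows(200000);

The notices show up as the function runs, so if they stop coming you know
something is stuck rather than just slow.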