Re: Bunching "transactions"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jean-David Beyer wrote:
> This means, of course, that the things I think of as transactions have been
> bunched into a much smaller number of what postgreSQL thinks of as large
> transactions, since there is only one per file rather than one per record.
> Now if a file has several thousand records, this seems to work out just great.

Using the small transactions, you were limited by the speed your hard
disk flush the commit WAL records to the disk. With small transactions
like that, it's not about the bandwidth, but latency of the hard drive.
Using larger transactions helps because you get more work done on each
disk operation.

Upcoming 8.3 release will have a feature called "asynchronous commit",
which should speed up those small transactions dramatically, if you
don't want to batch them into larger transactions like you did:

http://www.postgresql.org/docs/8.3/static/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT

> But what is the limitation on such a thing? In this case, I am just
> populating the database and there are no other users at such a time. I am
> willing to lose the whole insert of a file if something goes wrong -- I
> would fix whatever went wrong and start over anyway.
> 
> But at some point, disk IO would have to be done. Is this just a function of
> how big /pgsql/data/postgresql.conf's shared_buffers is set to? Or does it
> have to do with wal_buffers and checkpoint_segments?

Well, you have to do the I/O eventually, regardless of shared_buffers.
Common wisdom is that increasing wal_buffers from the default helps with
bulk loading like that, up to a point. Increasing checkpoint_segments
helps as well. After you've done all that, you're going to be limited by
either the bandwidth of your I/O system, or the speed of your CPU,
depending on your hardware. Using COPY instead of INSERTs will help if
it's CPU.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux