On Fri, 12 Sep 2008, William Garrison wrote:
Is there a definitive list of things to do?
That section of the documention is pretty good:
http://www.postgresql.org/docs/current/static/populate.html
The main thing it's missing is a discussion of how to cut down on disk
commit overhead by either usinc async commit or turning fsync off. If
you've got a good caching controller that may not be needed though.
The other large chunk of information it doesn't really go into is what
server tuning you could do to improve general performance, which obviously
would then help with loading as well.
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server goes over
much of that.
* Turn off full_page_writes
Don’t write the WAL archives in a safe way. But we don’t need WAL archives
during a restore. Future versions of postgres will let you turn off WAL
archives entirely
Ideally you'd be using COPY such that the table was just created or
truncated before loading, which (if archive_mode is off) keeps them from
being WAL logged, as described in 14.4.7. If you do that and vastly
increase checkpoint_segments, full_page_writes has minimal impact.
* Increase the checkpoint_segments parameter (the default is 3 – so... maybe
10?)
64-256 is the usual range you'll see people using for bulk loading.
* Increase the maintenance_work_mem setting to 512MB
I haven't really seen any real improvement setting that over 256MB. If
you've got RAM to waste it doesn't really matter if you set it too high
though.
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD