Re: Performace Optimization for Dummies

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/28/06, Carlo Stonebanks <stonec.register@xxxxxxxxxxxx> wrote:
> are you using the 'copy' interface?

Straightforward inserts - the import data has to transformed, normalised and
de-duped by the import program. I imagine the copy interface is for more
straightforward data importing. These are - buy necessity - single row
inserts.

right. see comments below.

> thats a tough question.  my gut says that windows will not scale as
> well as recent linux kernels in high load environments.

But not in the case of a single import program trying to seed a massive
database?

probably not.

> hearing good things about the woodcrest. pre-woodcrest xeon (dempsey
> down) is outclassed by the opteron.

Need to find a way to deterimine the Xeon type. The server was bought in
early 2006, and it looks like woodcrest was form July.

ok, there are better chips out there but again this is not something
you would really notice outside of high load environements.

> 1. can probably run fsync=off during the import
> 2. if import is single proecess, consider temporary bump to memory for
> index creation. or, since you have four cores consider having four
> processes import the data somehow.
> 3. turn off stats collector, stats_command_string, stats_row_level,
> and autovacuum during import.

by the way, stats_command_string is a known performance killer that
iirc was improved in 8.2. just fyi.

I would suggest at least consideration of retooling your import as
follows...it might be a fun project to learn some postgresql
internals.  I'm assuming you are doing some script preprocessing in a
language like perl:

bulk load denomalized tables into scratch tables into the postgresql
database. create indexes appropriate to the nomalization process
remembering you can index on virtually any expression in postgresql
(including regex substitution).

use sql to process the data. if tables are too large to handle with
monolithic queries, use cursors and/or functions to handle the
conversion.  now you can keep track of progress using pl/pgsql raise
notice command for example.

merlin


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux