On Thu, 14 Jan 2010 14:17:13 -0500, "Carlo Stonebanks" <stonec.register@xxxxxxxxxxxx> wrote: > My client just informed me that new hardware is available for our DB > server. > > . Intel Core 2 Quads Quad > . 48 GB RAM > . 4 Disk RAID drive (RAID level TBD) > > I have put the ugly details of what we do with our DB below, as well as > the > postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB > with very large tables and the server is always busy serving a constant > stream of single-row UPDATEs and INSERTs from parallel automated processes. > > There are less than 10 users, as the server is devoted to the KB > production > system. > > My questions: > > 1) Which RAID level would you recommend 10 > 2) Which Windows OS would you recommend? (currently 2008 x64 Server) If you have to run Windows... that works. > 3) If we were to port to a *NIX flavour, which would you recommend? (which > support trouble-free PG builds/makes please!) Community driven: Debian Stable CentOS 5 Commercial: Ubuntu LTS RHEL 5 > 4) Is this the right PG version for our needs? You want to run at least the latest stable 8.3 series which I believe is 8.3.9. With the imminent release of 8.5 (6 months), it may be time to move to 8.4.2 instead. Joshua D. Drake > > Thanks, > > Carlo > > The details of our use: > > . The DB hosts is a data warehouse and a knowledgebase (KB) tracking the > professional information of 1.3M individuals. > . The KB tables related to these 130M individuals are naturally also large > . The DB is in a perpetual state of serving TCL-scripted Extract, > Transform > and Load (ETL) processes > . These ETL processes typically run 10 at-a-time (i.e. in parallel) > . We would like to run more, but the server appears to be the bottleneck > . The ETL write processes are 99% single row UPDATEs or INSERTs. > . There are few, if any DELETEs > . The ETL source data are "import tables" > . The import tables are permanently kept in the data warehouse so that we > can trace the original source of any information. > . There are 6000+ and counting > . The import tables number from dozens to hundreds of thousands of rows. > They rarely require more than a pkey index. > . Linking the KB to the source import date requires an "audit table" of > 500M > rows, and counting. > . The size of the audit table makes it very difficult to manage, > especially > if we need to modify the design. > . Because we query the audit table different ways to audit the ETL > processes > decisions, almost every column in the audit table is indexed. > . The maximum number of physical users is 10 and these users RARELY > perform > any kind of write > . By contrast, the 10+ ETL processes are writing constantly > . We find that internal stats drift, for whatever reason, causing row seq > scans instead of index scans. > . So far, we have never seen a situation where a seq scan has improved > performance, which I would attribute to the size of the tables > . We believe our requirements are exceptional, and we would benefit > immensely from setting up the PG planner to always favour index-oriented > decisions - which seems to contradict everything that PG advice suggests > as > best practice. > > Current non-default conf settings are: > > autovacuum = on > autovacuum_analyze_scale_factor = 0.1 > autovacuum_analyze_threshold = 250 > autovacuum_naptime = 1min > autovacuum_vacuum_scale_factor = 0.2 > autovacuum_vacuum_threshold = 500 > bgwriter_lru_maxpages = 100 > checkpoint_segments = 64 > checkpoint_warning = 290 > datestyle = 'iso, mdy' > default_text_search_config = 'pg_catalog.english' > lc_messages = 'C' > lc_monetary = 'C' > lc_numeric = 'C' > lc_time = 'C' > log_destination = 'stderr' > log_line_prefix = '%t ' > logging_collector = on > maintenance_work_mem = 16MB > max_connections = 200 > max_fsm_pages = 204800 > max_locks_per_transaction = 128 > port = 5432 > shared_buffers = 500MB > vacuum_cost_delay = 100 > work_mem = 512MB -- PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997 -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance