I thought that post mentioned that the plan was one statement in an iteration, and that the cache would have been primed by a previous query checking whether there were any rows to update. If that was the case, it might be worthwhile to look at the entire flow of an iteration.
This is the only SQL query in the code in question - the rest of the code manages the looping and commits. The code was copied to PgAdminIII and values written in for the WHERE clause. In order to validate that rows would have been updated, I had to run a SELECT with the same WHERE clause in PgAdminIII first to see how many rows would have qualified. But this was for testing purposes only; the SELECT statement does not exist in the code. The vast majority of the rows processed will be updated, as this is a backfill to sync the old rows with the values now being filled into new columns on insert.
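For reference, one iteration looks roughly like the sketch below. The table and column names are invented for illustration, since the actual query text was not posted; the SELECT is the manual check run in PgAdminIII, not part of the code.

```sql
-- Manual check (PgAdminIII only, for testing -- not in the code):
-- confirms how many rows would qualify for this batch.
SELECT count(*)
FROM   big_table          -- hypothetical table name
WHERE  id BETWEEN 1000000 AND 1010000
AND    new_col IS NULL;   -- hypothetical backfill condition

-- The single UPDATE the code actually runs per iteration,
-- followed by a COMMIT before moving to the next batch:
UPDATE big_table
SET    new_col = old_col  -- hypothetical backfill expression
WHERE  id BETWEEN 1000000 AND 1010000
AND    new_col IS NULL;
```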
Also, if you ever responded with version and configuration information, I missed it.
This is hosted on a new server the client set up so I am waiting for the exact OS and hardware config. PG Version is PostgreSQL 8.3.6, compiled by Visual C++ build 1400, OS appears to be Windows 2003 x64 Server.
More than anything, I am concerned with the long-term use of the system. This particular challenge with the 500M row update is one thing, but I am concerned about the exceptional effort required to do it. Is it REALLY this exceptional to want to update 500M rows of data in this day and age? Or is the fact that we are considering dumping and restoring, dropping indexes, etc. to do this all an early warning that we don't have a solution that is scaled to the problem?
Config data follows (I am assuming the commented-out values, which I did not include, are at their defaults).
Carlo

autovacuum = on
autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 250
autovacuum_naptime = 1min
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 500
bgwriter_lru_maxpages = 100
checkpoint_segments = 64
checkpoint_warning = 290
datestyle = 'iso, mdy'
default_text_search_config = 'pg_catalog.english'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
log_destination = 'stderr'
log_line_prefix = '%t '
logging_collector = on
maintenance_work_mem = 16MB
max_connections = 200
max_fsm_pages = 204800
max_locks_per_transaction = 128
port = 5432
shared_buffers = 500MB
vacuum_cost_delay = 100
work_mem = 512MB