Re: Massive table (500M rows) update nightmare

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Carlo Stonebanks wrote:
> Our DB has an audit table which is 500M rows and growing. (FYI the
> objects being audited are grouped semantically, not individual field
> values).
> 
> Recently we wanted to add a new feature and we altered the table to add
> a new column. We are backfilling this varchar(255) column by writing a
> TCL script to page through the rows (where every update is a UPDATE ...
> WHERE id >= x AND id < x+10 and a commit is performed after every 1000
> updates statement, i.e. every 10000 rows.)
> 
> We have 10 columns, six of which are indexed. Rough calculations suggest
> that this will take two to three weeks to complete on an 8-core CPU with
> more than enough memory.
> 
> As a ballpark estimate - is this sort of performance for an 500M updates
> what one would expect of PG given the table structure (detailed below)
> or should I dig deeper to look for performance issues?
> 
> As always, thanks!
> 
> Carlo
> 

If it is possible to lock this audit table exclusively (may be during
off peak hours) I would look into
- create new_audit_table as select col1, col2, col3 ... col9,
'new_col_value' from old_audit_table;
- create all indexes
- drop old_audit_table
- rename new_audit_table to old_audit_table

That is probably the fasted method you can do, even if you have to join
the "new_col_value" from an extra helper-table with the correspondig id.
Remeber, databases are born to join.

You could also try to just update the whole table in one go, it is
probably faster than you expect.

just a thought
Leo

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux