Re: Massive table (500M rows) update nightmare

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Carlo Stonebanks wrote:
This is hosted on a new server the client set up so I am waiting for the exact OS and hardware config. PG Version is PostgreSQL 8.3.6, compiled by Visual C++ build 1400, OS appears to be Windows 2003 x64 Server.

More than anything, I am more concerned with the long-term use of the system. This particular challenge with the 500M row update is one thing, but I am concerned about the exceptional effort required to do this. Is it REALLY this exceptional to want to update 500M rows of data in this day and age? Or is the fact that we are considering dumping and restoring and dropping indexes, etc to do all an early warning that we don't have a solution that is scaled to the problem?

It's certainly not common or easy to handle. If someone told me I had to make that task well perform well and the tools at hand were anything other than a largish UNIX-ish server with a properly designed disk subsystem, I'd tell them it's unlikely to work well. An UPDATE is the most intensive single operation you can do in PostgreSQL; the full lifecycle of executing it requires:

-Creating a whole new row
-Updating all the indexes to point to the new row
-Marking the original row dead
-Pruning the original row out of the database once it's no longer visible anywhere (VACUUM)

As a rule, if you can't fit a large chunk of the indexes involved in shared_buffers on your system, this is probably going to have terrible performance. And you're on a platform where that's very hard to do, even if there's a lot of RAM around. It sounds like your system spends all its time swapping index blocks in and out of the database buffer cache here. That suggests there's a design problem here either with those indexes (they're too big) or with the provisioned hardware/software combination (needs more RAM, faster disks, or a platform where large amounts of RAM work better).

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux