On Jun 18, 2013, at 1:49 PM, David Johnston <polobo@xxxxxxxxx> wrote: > Steven Schlansker-3 wrote >> At some point, the code changes, and CURRENT_VERSION gets incremented. >> Rows then slowly (over a period of days / weeks) get "upgraded" to the new >> current version, in batches of thousands. >> >> This is what I mean by a very slowly changing mostly-constant value. > > This seems insane without knowing the details. This seems like it would be > more of a cache invalidation problem. What percentage of your rows are > being updated multiple times without ever being queried for other reasons? I am open to suggestions of how to do it better. The problem I face is that doing any sort of updates in one big go -- whether it be by ALTER TABLE statements or large UPDATE queries -- is all but unworkable. It takes days or weeks depending on what the update is, so any locking causes the entire system to grind to a halt. And there is nothing more depressing than losing 5 days of work on a huge UPDATE because something hiccuped. Hence, allowing "outdated" versions in the table, which then over time get upgraded in reasonably-sized batches. > > I was going to say that table partitioning (INHERITS) seems like a > possibility; then I thought maybe not; now I'm back to suggesting you > consider it. > > Every version of the extractor would get its own table. To "upgrade" you > remove the record from the older table and add it to the newer one. Maybe > even consider calling the these "version_upgraded" to distinguish them from > records originally insert using the newest version. Or have "original > version" as the partition key and a second "current version" field that > varies. Not sure how the planner would be able to use constraint exclusion > to limiting the scanning though… > Interesting idea. I have been trying to avoid making code changes require schema changes as well -- it is very nice to not have to make schema changes for every code deployment. The code may get changed multiple times in the same day, if I am busy hacking on it. Having to muck around with table inheritance and changing partition definitions on code deployments seems unpleasant. Perhaps I am overestimating the work involved, but I am very much trying to keep the deployment process as brain-dead-simple as possible. Thanks for the input. Steven -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general