On Jun 18, 2013, at 12:23 PM, John R Pierce <pierce@xxxxxxxxxxxx> wrote:

> On 6/18/2013 12:17 PM, Steven Schlansker wrote:
>> 1) The common value is not known at schema definition time, and may change (very slowly) over time.
>
> how could a value that's constant in 95% of the rows change, unless you added 20 times more rows with a new value (and for a big portion of the time, no value would meet your 95% criteria).

The table is a denormalized version of some packed data. The packed data is constant, but the extractor code changes over time. The value in question is an "extractor version used to create this row".

There is a periodic job that attempts to find batches of rows whose fields were extracted by an old version of the extractor. Those rows are re-extracted from the packed data.

So, most of the time the vast majority of rows will have CURRENT_VERSION as their version, and a small percentage of rows will have a previous version. The job selects rows where extracted_version != CURRENT_VERSION. If this query is not indexed, even a periodic check for whether any such rows exist takes an absurd amount of time.

At some point the code changes and CURRENT_VERSION gets incremented. Rows then slowly (over a period of days / weeks) get "upgraded" to the new current version, in batches of thousands. This is what I mean by a very slowly changing, mostly-constant value.

Hope that makes sense,
Steven

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
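
For reference, a minimal sketch of the kind of partial index under discussion, assuming a hypothetical table "items" with an integer extracted_version column; the table name, column names, batch size, and the hardcoded version literal are illustrative only, not from the original thread:

    -- Partial index covering only the stale rows, so the periodic
    -- "is there any work to do?" check and the batch selection stay cheap
    -- even when 95%+ of rows carry the current version.
    -- (Assumes CURRENT_VERSION is 42 at the time the index is created.)
    CREATE INDEX items_stale_version_idx
        ON items (id)
        WHERE extracted_version <> 42;

    -- Periodic job: pick up a batch of rows extracted by an older version.
    SELECT id
      FROM items
     WHERE extracted_version <> 42
     LIMIT 5000;

The catch, as point (1) in the quoted message notes, is that the index predicate must be a literal: once CURRENT_VERSION is bumped to 43, the planner cannot use an index whose predicate says "<> 42" for a query testing "<> 43", so the partial index would have to be dropped and recreated with the new value each time the extractor version changes.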