In this usage scenario, doesn't the new HOT (heap only tuples) feature of PG8.3 help, in terms of the DB requiring less VACUUM maintenance? I am similarly performing a huge number of inserts, followed by a huge number of updates to fill in a few null fields. The data is indexed by insert time. My problem is, selects using that index degrade over time as updates are performed, presumably because data is no longer ordered sequentially across pages after updates are performed. I was hoping that HOT would help here and am actually installing PG8.3 now in order to perform some testing... Mike Andrew Sullivan wrote: > On Wed, Nov 14, 2007 at 10:28:30AM -0500, Gauthier, Dave wrote: >> null. My concern has to do with record fragmentation at the time of >> update because there's no room to "expand" them to accept the non-null >> data. (BTW, the columns are floating point). > > You have a mistaken idea about how this works. > >> Is there a way to initially insert nulls, but reserve space for the >> future update (and avoid record fragmentation)? > > No. > >> Is my record fragmentation concern unfounded? > > Sort of. > > The way this will work in Postgres is that, when you UPDATE the row, the old > row will be marked dead, and a _new_ row will be written out with the new > data. You will need to perform VACUUM in order to keep the table from > bloating. You'll want to read the manual carefully about this topic, in > order to keep your table from getting so bloated that your free space map > becomes useless. One of the weakest areas for PostgreSQL is its behaviour > under this sort of "most rows updated" scenario, and it is wise to plan > carefully how you will accomplish these sorts of activities without causing > yourself extreme pain. > > A > ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend