On 07/20/2011 12:58 PM, A J wrote:
As with everything the answer is "it depends". For a "typical" workload where the rows updated by a single query are one or a few rowsl, the automatic vacuum process should handle everything for you without intervention (some tuning of the vacuum settings may be in order but it runs by itself). When you update a row in PostgreSQL, the server inserts the "new" version of the row but also keeps the "old" row available until it is no longer visible to any running queries after which that space can be reclaimed. Locating reclaimable space and making it available for reuse is the job of the autovacuum process. If everything is tuned properly, PostgreSQL will keep refilling the "holes" in the table on its own. There are some exceptions. A query that updates all rows in a table as might happen in some maintenance operations may increase the table-size more than you desire. This can be corrected using the cluster command. Also, there are some workloads that have queries that grab chunks of data, say all records for a given date, where keeping the associated data physically close can improve performance. If you are evaluating PG, you might want to post some info on the nature of the application. People are here because they like and use PostgreSQL but most are quite open about saying when it isn't an appropriate solution as well. You may also get tips on avoiding some common pitfalls that can prevent PostgreSQL from showing its full potential. Cheers, Steve |