On Wed, 25 Nov 2009, Grzegorz Jaśkiewicz wrote:
the out of order data layout is primary reason for index bloat. And that happens , and gets worse over time once data is more and more distributed. ("random" deletes, etc).
That's not index bloat. Sure, having the table not in the same order as the index will slow down an index scan, but that's a completely different problem altogether.
Index bloat is caused by exactly the same mechanism as table bloat. The index needs to have an entry for every row in the table that may be visible by anyone. As with the table, it is not possible to deterministically delete the rows as they become non-visible, so the index (and the table) will be left with dead entries on delete and update. The vacuum command performs garbage collection and marks these dead rows and index entries as free, so that some time in the future more data can be written to those places.
Index bloat is when there is an excessive amount of dead space in an index. It can be prevented by (auto)vacuuming regularly, but can only be reversed by REINDEX (or of course deleting the index, or adding loads of new entries to fill up the dead space after vacuuming).
Matthew -- for a in past present future; do for b in clients employers associates relatives neighbours pets; do echo "The opinions here in no way reflect the opinions of my $a $b." done; done -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance