Thanks for this description--we have index bloat problems on a massively active (but small) database.This may help shed light on our problems.
Sorry for top-posting--challenged email reader.
Greg W.
From: Jeff Janes <jeff.janes@xxxxxxxxx>
To: Strahinja Kustudić <strahinjak@xxxxxxxxxxx>
Cc: pgsql-performance@xxxxxxxxxxxxxx
Sent: Friday, August 17, 2012 7:33 PM
Subject: Re: Index Bloat Problem
On Thu, Aug 16, 2012 at 12:57 PM, Strahinja Kustudić
<strahinjak@xxxxxxxxxxx> wrote:
>
> @Jeff I'm not sure if I understand what you mean? I know that we never reuse
> key ranges. Could you be more clear, or give an example please.
If an index leaf page is completely empty because every entry on it
were deleted, it will get recycled to be used in some other part of
the index. (Eventually--it can take a while, especially if you have
long-running transactions).
But if the leaf page is only mostly empty, because only most of
entries on it were deleted, than it can never be reused, except for
entries that naturally fall into its existing key range (which will
never happen, if you never reuse key ranges)
So if you have a million records with keys 1..1000000, and do a
"delete from foo where key between 1 and 990000", then 99% of those
old index pages will become completely empty and eligible for reuse.
But if you do "delete from foo where key%100>0", then all of the pages
will become 99% empty, and none will be eligible for reuse (except the
very last one, which can still accept 1000001 and so on)
There has been talk of allowing logically adjacent, mostly empty
pages to be merged so that one of them becomes empty, but the way
concurrent access to btree indexes was designed this is extremely hard
to do safely.
Cheers,
Jeff
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance