Search Postgresql Archives

Re: Frequetly updated partial index leads to bloat on index for Postresql 11

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Tom-Tom:

On Fri, Jul 16, 2021 at 5:43 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Tom Dearman <tom.dearman@xxxxxxxxx> writes:
> > We have change autovacuum so that it runs more frequently autovacuum_vacuum_scale_factor=0.035, the reason we have a partial index on the status is that in a table of 300 million entries, only about 100 or so would have status=‘IN_PROGRESS’ so we think this should be a nice small index and many of our queries want to look up with a where clause status=‘IN_PROGRESS’.  In theory it works well, but we get a lot of index bloat as there is a lot of churn on the status value, ie each row starts as IN_PROGRESS and then goes to one of 4 possible completed statuses.
> Is it really the case that only this index is bloating?  In principle, an
> update on a row of the table should result in new entries in every index
> of the table.  A partial index, due to the filter applied to possibly not
> store any index entry, should in theory have less bloat than other
> indexes.

May be not in relative terms. If I understand correctly, lets say you
start unbloated with 300M entries with 100 in progress. You insert, in
small batches, 10K rows "in_progress" and update 10k "in_progress" row
to, let's say, "done" ( may be including the 100 original ones ) (
description seems to fit this kind of flow, something like a task
queue ). Then you will have 10k dead tuples bloating a 100 live
entries index, for an absolutely small but relatively large bloat,
while a full pk index will have 10k for 300m.

I may be misunderstanding some thing and HOT and similar things may
help here, but it seems like a plausible explanation for an apparent
bloat ( on a small index, I had similar things but for task-like
things I use a pending and a done table, and the pending table bloated
a lot on the before-autovacuum times,  small table nearly all dead
tuples, not a problem vaccuming it via cron every some minutes, as in
this case only the index is bloated autovacuum may not fire frequently
enough for it, as the table would have only 10k/30M~333 ppm bloat  )

Francisco Olarte






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux