On Thu, Aug 31, 2023 at 11:06 AM Maxim Boguk <maxim.boguk@xxxxxxxxx> wrote:
With the default value of autovacuum_vacuum_scale_factor (0.2, i.e. 20% of table size), the index will collect something like 100M outdated/dead index entries before autovacuum kicks in and cleans them all (in the worst case), and of course that leads to huge index bloat and awful performance.
Index bloat doesn't automatically lead to awful performance. There must be some additional factor at play.
Even if you scale down autovacuum_vacuum_scale_factor to some unreasonably low value like 0.01, the index will still bloat to 5M dead entries before autovacuum runs, and constant vacuuming of a huge 500M-row table will put a heavy load on the database server.
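For what it's worth, the scale factor doesn't have to be lowered globally; autovacuum parameters can be overridden per table, which at least confines the extra vacuum activity to the queue table. A sketch, with a hypothetical table name:

    ALTER TABLE task_queue SET (
        autovacuum_vacuum_scale_factor = 0.0,  -- ignore table size entirely
        autovacuum_vacuum_threshold = 100000   -- vacuum after ~100k dead tuples
    );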
Unfortunately there is no easy way out of this situation from the database side. In general I recommend not trying to implement a fast-paced queue-like workload inside a huge and constantly growing table; it never works well, because you cannot keep efficient partial indexes for the queue in a clean/non-bloated state. In my opinion the best solution is to keep the list of entries to process ("around 1000-1500 tasks in pending statuses") duplicated in a separate tiny table (maintained via triggers or at the application level); in that case autovacuum will be able to quickly clean dead entries from the index.
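For concreteness, a minimal sketch of that trigger-maintained side table, assuming a tasks table with id and status columns (all names hypothetical):

    CREATE TABLE pending_tasks (task_id bigint PRIMARY KEY);

    CREATE FUNCTION sync_pending_tasks() RETURNS trigger AS $$
    BEGIN
        IF NEW.status = 'pending' THEN
            -- mirror newly pending tasks into the tiny side table
            INSERT INTO pending_tasks VALUES (NEW.id)
            ON CONFLICT DO NOTHING;
        ELSE
            -- drop tasks that have left the pending state
            DELETE FROM pending_tasks WHERE task_id = NEW.id;
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER pending_tasks_sync
        AFTER INSERT OR UPDATE OF status ON tasks
        FOR EACH ROW EXECUTE FUNCTION sync_pending_tasks();

(A DELETE trigger on tasks would be needed as well; omitted for brevity.) Because pending_tasks never grows past a few thousand rows, autovacuum can keep it and its index clean almost for free.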
You should be able to use declarative partitioning to separate the "final" tuples from the "active" tuples, to get the same benefit but with less work.
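Something along these lines, assuming the queue lives in a tasks table with a status column (names hypothetical):

    CREATE TABLE tasks (
        id      bigint NOT NULL,
        status  text   NOT NULL,
        payload jsonb
    ) PARTITION BY LIST (status);

    CREATE TABLE tasks_active PARTITION OF tasks
        FOR VALUES IN ('pending', 'running');
    CREATE TABLE tasks_final PARTITION OF tasks
        FOR VALUES IN ('done', 'failed');

An UPDATE that changes status moves the row between partitions automatically (PostgreSQL 11 and later), so the active partition stays tiny and autovacuum has no trouble keeping its indexes clean.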
Cheers,
Jeff