Re: Index bloat and REINDEX/VACUUM optimization for partial index

Thanks Maxim, that's something we are considering now: keep the in-progress tasks in one table and periodically move the old, completed tasks to an archive table.
We could use a view that unions them for most queries.
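
Roughly what we have in mind (a sketch only; the table and view names are placeholders, and it assumes the archive keeps the same schema as tasks):

```sql
-- Hot table holds recent/in-progress work; archive holds completed history.
CREATE TABLE tasks_active  (LIKE tasks INCLUDING ALL);
CREATE TABLE tasks_archive (LIKE tasks INCLUDING ALL);

-- Most read queries would go through a view that unions the two.
CREATE VIEW tasks_all AS
SELECT * FROM tasks_active
UNION ALL
SELECT * FROM tasks_archive;
```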

I'm not sure that's the best alternative, though, and we'd like to know if there are any gotchas to worry about.

On Thu, Aug 31, 2023, 8:06 AM Maxim Boguk <maxim.boguk@xxxxxxxxx> wrote:

At any moment, there are around 1000-1500 tasks in pending statuses (Init + InProgress) out of around 500 million tasks. 

Now, we have a task monitoring query that will look for all pending tasks that have not received any update in the last n minutes.

```
SELECT [columns list]
  FROM tasks
 WHERE status NOT IN (3, 4, 5)
   AND created > NOW() - INTERVAL '30 days'
   AND updated < NOW() - INTERVAL '30 minutes'
```

Since we are only interested in the pending tasks, I created a partial index
 `"tasks_pending_status_created_type_idx" btree (status, created, task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.

This worked great initially; however, the index started to bloat very quickly, because every task starts in a pending state, receives multiple updates (many of which are not HOT updates; we are working on tuning the fillfactor now), and is eventually removed from the index once its status changes to success.

From my experience, I suspect the problem lies in the "around 500 million tasks" part.
Autovacuum does clean old dead index entries, but how many such dead index entries will accumulate on a 500M-row table before autovacuum kicks in?

With the default autovacuum_vacuum_scale_factor of 0.2 (20% of the table size), the index can accumulate something like 100M outdated/dead index entries before autovacuum kicks in and cleans them all (in the worst case), and of course that leads to huge index bloat and awful performance.
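
For the record, that number comes from the trigger formula in the PostgreSQL docs, threshold = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples; with the defaults on a 500M-row table:

```sql
-- Defaults: autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2
SELECT 50 + 0.2 * 500000000 AS dead_tuples_before_autovacuum;  -- ~100 million
```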

Even if you scale autovacuum_vacuum_scale_factor down to an unreasonably low value like 0.01, the index can still bloat to 5M dead entries before an autovacuum run, and constantly vacuuming a huge 500M-row table will put a heavy load on the database server.
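
(If you do want to experiment with that anyway, these settings can be applied per table rather than globally; the values below are purely illustrative:)

```sql
ALTER TABLE tasks SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- still ~5M dead entries on 500M rows
    autovacuum_vacuum_cost_delay   = 0      -- let the vacuum run at full speed
);
```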

Unfortunately, there is no easy way out of this situation on the database side. In general, I recommend not implementing a fast-paced queue-like workload inside a huge and constantly growing table; it never works well, because you cannot keep the partial indexes that serve the queue in a clean, non-bloated state.

In my opinion, the best solution is to keep the list of entries to process (the "around 1000-1500 tasks in pending statuses") duplicated in a separate tiny table (maintained via triggers or at the application level); in that case autovacuum will be able to clean dead entries from its index quickly.
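
A minimal sketch of the trigger-based variant (illustrative names; it assumes tasks has a bigint primary key id and that "pending" means status not in (3, 4, 5)):

```sql
-- Tiny side table that only ever holds the ~1000-1500 pending tasks.
CREATE TABLE tasks_pending (
    task_id bigint PRIMARY KEY
);

-- Keep it in sync whenever a task is created or changes status.
CREATE FUNCTION tasks_pending_sync() RETURNS trigger AS $$
BEGIN
    IF NEW.status NOT IN (3, 4, 5) THEN
        INSERT INTO tasks_pending (task_id)
        VALUES (NEW.id)
        ON CONFLICT (task_id) DO NOTHING;
    ELSE
        DELETE FROM tasks_pending WHERE task_id = NEW.id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tasks_pending_sync_trg
AFTER INSERT OR UPDATE OF status ON tasks
FOR EACH ROW EXECUTE FUNCTION tasks_pending_sync();
```

The monitoring query then joins tasks_pending back to tasks, and autovacuum can keep a table of a couple of thousand rows clean almost instantly.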

Kind Regards,
Maxim


--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

