Re: Index bloat and REINDEX/VACUUM optimization for partial index

jayaprabhakar k <jayaprabhakar@xxxxxxxxx> · Wed, 30 Aug 2023 17:42:58 -0700

On Tue, Aug 29, 2023, 12:43 PM Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
On Mon, Aug 28, 2023 at 8:33 PM jayaprabhakar k <jayaprabhakar@xxxxxxxxx> wrote:
Hi,
TL;DR: 
Observations:
REINDEX requires a full table scan
Roughly create a new index, rename index, drop old index.
REINDEX is not incremental. running reindex frequently does not reduce the future reindex time.
REINDEX does not use the index itself
VACUUM does not clean up the indices. (relpages >> reltuples) I understand, vacuum is supposed to remove pages only if there are no live tuples in the page, but somehow, even immediately after vacuum, I see relpages significantly greater than reltuples. I would have assumed, relpages <= reltuples 
Query Planner does not consider index bloat, so uses highly bloated partial index that is terribly slow over other index
Your points 3 and 4 are not correct.  empty index pages are put on a freelist for future reuse, they are not physically removed from the underlying index files.  Maybe they are not actually getting put on the freelist or not being reused from the freelist for some reason, but that would be a different issue.  Use the extension pgstattuple to see what its function pgstatindex says about the index.  

The planner does take index bloat into consideration, but its effect size is low.  Which it should be, as empty or irrelevant pages should be efficiently skipped during the course of most index operations. To figure out what is going with your queries, you should do an EXPLAIN (ANALYZE, BUFFERS) of them, but with it being slow and with it being fast.

Question: Is there a way to optimize postgres vacuum/reindex when using partial indexes?

Without knowing what is actually going wrong, I can only offer generalities.  Make sure you don't have long-lived transactions which prevent efficient clean up.  Increase the frequency on which vacuum runs on the table.  It can't reduce the size of an already bloated index, but by keeping the freelist stocked it should be able prevent it from getting bloated in the first place.  Also, it can remove empty pages from being linked into the index tree structure, which means they won't need to be scanned even though they are still in the file.  It can also free up space inside non-empty pages for future reuse within that same page, and so that index tuples don't need to be chased down in the table only to be found to be not visible.

```
SELECT [columns list]
  FROM tasks
  WHERE status NOT IN (3,4,5) AND created > NOW() - INTERVAL '30 days' AND updated < NOW() - interval '30 minutes'
```

Since we are only interested in the pending tasks, I created a partial index
 `"tasks_pending_status_created_type_idx" btree (status, created, task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.

This looks like a poorly designed index.  Since the status condition exactly matches the index where clause, there is no residual point in having "status" be the first column in the index, it can only get in the way (for this particular query).  Move it to the end, or remove it altogether.
Interesting. I don't understand why it will get in the way. Unfortunately we have a few other cases where status is used in filter. That said, I will consider how to get this to work. 
Would removing status from the index column, improve HOT updates %? For example, changing status from 1->2, doesn't change anything on the index (assuming other criteria for HOT updates are met), but I am not sure how the implementation is.

Within the tuples which pass the status check, which inequality is more selective, the "created" one or "updated" one?
Obviously updated time is more selective (after status), and the created time is included only to exclude some bugs in our system that had left some old tasks stuck in progress (and for sorting). We do try to clean up occasionally, but not each time. 
However we cannot add an index on `updated` column because that timestamp gets updated over 10x on average for each task. Since if a single index use a column, then the update will not be HOT, and every index needs to be updated. That will clearly add a bloat to every index. Did I miss something?

Cheers,

Jeff