Re: Use of inefficient index in the presence of dead tuples

Laurenz Albe <laurenz.albe@xxxxxxxxxxx> · Tue, 28 May 2024 13:02:49 +0200

On Tue, 2024-05-28 at 10:00 +0200, Alexander Staubo wrote:
> I am encountering an odd problem where Postgres will use the wrong index, particularly if the table
> has some dead tuples. The database affected is running 12.6, but I can also reproduce with 16.3.
> 
> To reproduce:
> [create a table with a larger index on "id" and "receiver" and a smaller on
>  "receiver" and "created_at", then delete all but one row and ANALYZE]
> 
> (7) Try the following query:
> 
>     EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS, TIMING, SETTINGS, SUMMARY)
>     SELECT id FROM outbox_batches
>     WHERE receiver = 'dummy'
>     AND id = 'test';
> 
> Here's the query plan:
> 
>     Index Scan using outbox_batches_on_receiver_and_created_at on public.outbox_batches  (cost=0.38..8.39 rows=1 width=5) (actual time=0.426..984.038 rows=1 loops=1)
>     Output: id
>     Index Cond: (outbox_batches.receiver = 'dummy'::text)
>     Filter: (outbox_batches.id = 'test'::text)
>     Buffers: shared hit=3948 read=60742 dirtied=60741 written=30209
>     Settings: work_mem = '32MB'
>     Query Identifier: -2232653838283363139
>     Planning:
>     Buffers: shared hit=18 read=3
>     Planning Time: 1.599 ms
>     Execution Time: 984.082 ms
> 
> This query is reading 60K buffers even though it only needs to read a single row. Notice in particular the
> use of the index outbox_batches_on_receiver_and_created_at, even though outbox_batches_pkey would be
> a much better choice. We know this because if we drop the first index:
> 
>     Index Only Scan using outbox_batches_pkey on public.outbox_batches  (cost=0.50..8.52 rows=1 width=5) (actual time=2.067..2.070 rows=1 loops=1)
>     Output: id
>     Index Cond: ((outbox_batches.receiver = 'dummy'::text) AND (outbox_batches.id = 'test'::text))
>     Heap Fetches: 1
>     Buffers: shared hit=1 read=4
>     Settings: work_mem = '32MB'
>     Query Identifier: -2232653838283363139
>     Planning:
>     Buffers: shared hit=5 dirtied=1
>     Planning Time: 0.354 ms
>     Execution Time: 2.115 ms
> 
> This is also the index that's used in the normal case when there are no dead tuples at all.
> 
> Interestingly, the cost of an index only scan on outbox_batches_pkey is 8.52, whereas the other is
> 8.39. Is this because it considers the number of index pages? I've tried adjusting the various cost
> and memory settings, but they have no effect.

ANALYZE considers only the live rows, so PostgreSQL knows that the query will
return only few results.  So it chooses the smaller index rather than the one
that matches the WHERE condition perfectly.

Unfortunately, it has to wade through all the deleted rows, which is slow.

But try to execute the query a second time, and it will be much faster.
PostgreSQL marks the index entries as "dead" during the first execution, so the
second execution won't have to look at the heap any more.

See https://www.cybertec-postgresql.com/en/killed-index-tuples/

> In this test, we created 5M dead tuples. However, for me it also reproduces with just 1,000 rows.
> For such a small table, the performance degradation is minimal, but it increases as more and more
> tuples are deleted.
> 
> In a production environment, we have rows being constantly deleted at a high rate, leaving a table
> that often has very few live tuples, and often 500K+ dead tuples before autovacuum can kick in. Here
> I am consistently seeing the wrong index used, leading to poor performance.
> 
> The autovacuum settings ar aggressive, but for whatever reason it is not keeping up. We also have
> long-running transactions that sometimes cause the xmin to hang back for a while, preventing
> vacuums from helping.
> 
> All of that said, I would rather Postgres choose the right index than spend a lot of time optimizing
> vacuums.

I understand your pain, but your use case is somewhat unusual.

What I would consider in your place is
a) running an explicit VACUUM after you delete lots of rows or
b) using partitioning to get rid of old data

I don't know how the PostgreSQL optimizer could be improved to take dead rows into account.

Yours,
Laurenz Albe