Re: PostgreSQL does not choose my indexes well

Stephen Frost <sfrost@xxxxxxxxxxx> · Thu, 23 Apr 2020 13:18:58 -0400

Greetings,

* Tom Lane (tgl@xxxxxxxxxxxxx) wrote:
> "David G. Johnston" <david.g.johnston@xxxxxxxxx> writes:
> > On Thursday, April 23, 2020, Thomas Kellerer <shammat@xxxxxxx> wrote:
> >> Plus: scanning idx_tabla_entidad is more efficient because that index is
> >> smaller.
> 
> > Really?  The absence of 33 million rows in the partial index seems like it
> > would compensate fully and then some for the extra included columns.
> 
> On the other hand, an indexscan is likely to end up being effectively
> random-access rather than the purely sequential access involved in
> a seqscan.  

An indexscan is what was chosen though, so this doesn't really seem to
be a question of index scan vs. seq scan, it's a question of why one
index vs. another, though it seems a bit odd that we'd pick a regular
index scan instead of a BitmapHeap/Index scan.

> (If the index was built recently, then it might not be
> so bad --- but the planner doesn't know that, so it assumes that the
> index leaf pages are laid out pretty randomly.)  Moreover, unless the
> table is mostly marked all-visible, there will be another pile of
> randomized accesses into the heap to validate visibility of the index
> entries.

If the table *is* marked all visible, though, then certainly that index
will be better, and I think that's what a lot of this is coming down to
in this particular case.

Populating the tables provided based on the minimal info we got,
minimizing the numbers of pages that 'cod_tabla=4' is on:

insert into tabla select generate_series, 'abcdef' from generate_series(1,20);
insert into entidad select generate_series, 4, generate_series+1 from generate_series(1,1409985);
insert into entidad select generate_series+1409985, generate_series % 20 + 1, generate_series+1 from generate_series(1,34413354) where generate_series % 20 + 1 <> 4;
vacuum analyze entidad;

With this, the table is 1.7GB, idx_tabla_entidad is about 700MB, while
idx_entidad_tabla_4 is only 81MB.

With this, on v12-HEAD, PG will happily use the partial index:

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using idx_entidad_tabla_4 on entidad  (cost=0.43..55375.20 rows=1422085 width=24) (actual time=0.050..144.745 rows=1409985 loops=1)
   Heap Fetches: 0
   Buffers: shared hit=8497
 Planning Time: 0.338 ms
 Execution Time: 183.081 ms
(5 rows)

Dropping that index and then running it again shows:

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on entidad  (cost=26641.72..608515.59 rows=1422085 width=24) (actual time=102.844..242.522 rows=1409985 loops=1)
   Recheck Cond: (cod_tabla = 4)
   Heap Blocks: exact=8981
   Buffers: shared read=12838
   ->  Bitmap Index Scan on idx_tabla_entidad  (cost=0.00..26286.20 rows=1422085 width=0) (actual time=101.969..101.969 rows=1409985 loops=1)
         Index Cond: (cod_tabla = 4)
         Buffers: shared read=3857
 Planning Time: 0.264 ms
 Execution Time: 277.854 ms
(9 rows)

If we spread out where the 'cod_tabla=4' tuples are, the partial index
is still used (note that we end up with more like 1.7M tuples instead of
1.4M, but I don't think that's terribly relevant):

truncate entidad;
insert into entidad select generate_series, generate_series % 20 + 1, generate_series+1 from generate_series(1,34413354);

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using idx_entidad_tabla_4 on entidad  (cost=0.43..65231.31 rows=1664459 width=24) (actual time=0.036..185.171 rows=1720668 loops=1)
   Heap Fetches: 0
   Buffers: shared hit=10375
 Planning Time: 0.247 ms
 Execution Time: 233.205 ms
(5 rows)

Things get a lot worse when we drop that partial index:

drop index idx_entidad_tabla_4;

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on entidad  (cost=30088.12..270087.86 rows=1664459 width=24) (actual time=163.418..1465.733 rows=1720668 loops=1)
   Recheck Cond: (cod_tabla = 4)
   Heap Blocks: exact=219194
   Buffers: shared read=223609
   ->  Bitmap Index Scan on idx_tabla_entidad  (cost=0.00..29672.01 rows=1664459 width=0) (actual time=128.544..128.544 rows=1720668 loops=1)
         Index Cond: (cod_tabla = 4)
         Buffers: shared read=4415
 Planning Time: 0.094 ms
 Execution Time: 1515.066 ms
(9 rows)

To get the kind of plan that the OP got, I dropped random_page_cost to 1.0:

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_tabla_entidad on entidad  (cost=0.56..251946.06 rows=1664459 width=24) (actual time=0.216..1236.371 rows=1720668 loops=1)
   Index Cond: (cod_tabla = 4)
   Buffers: shared read=223609
 Planning Time: 0.192 ms
 Execution Time: 1283.460 ms
(5 rows)

Even in that case though, when I recreate the partial index:

=# explain (analyze, buffers) select * from entidad where cod_tabla = 4;
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using idx_entidad_tabla_4 on entidad  (cost=0.43..35033.31 rows=1664459 width=24) (actual time=0.039..211.644 rows=1720668 loops=1)
   Heap Fetches: 0
   Buffers: shared hit=7 read=10368
 Planning Time: 0.144 ms
 Execution Time: 256.304 ms
(5 rows)

So it's not really clear what's happening in the OP's case, we'd really
need more information to figure it out, it seems to me.

> Bottom line is that this choice is not nearly as open-and-shut as
> the OP seems to think.  In fact, it's fairly likely that this is a
> badly designed index, not a well-designed one that the planner is
> unaccountably failing to use.  Both covering indexes and partial
> indexes are easily-misused features that can make performance worse
> not better.

While I agree they can be mis-used, and that it's not open-and-shut,
it's not really clear to me what's going on that's causing us to avoid
that partial index in the OP's case when we'll certainly use it in
general.  The partial index in this particular case seems like it'd be
perfectly well suited to this query and that we should be using it (as
we are in the tests I did above).

I do wonder if we are maybe missing a bet at times though, considering
that I'm pretty sure we'll always go through the index in order, and
therefore randomly, even when we don't actually need the results in
order..?  Has there been much consideration for just opening an index
and sequentially scanning it in cases like this where we have to go
through all of the index anyway and don't need the results in order?  I
get that we'd still have to consider random access costs if the VM is
out of date, but if it's not, I would think we could give such an
approach a lower cost as we'd be going through the index sequentially
instead of the normal random access that we do.

Thanks,

Stephen
Attachment:
signature.asc

Description: PGP signature