Re: Terribly slow query with very good plan?

Les <nagylzs@xxxxxxxxx> · Fri, 4 Feb 2022 11:00:47 +0100

I really think now that the query plan is wrong (or "could be improved" so to say). As far as I understand, the "index only scan" is essentially a sequential scan on the index data. In this specific case, where the filter is a "begins with" condition on a field that is the starting (and only) column of an index, there is a much much better way to find out if there is a row or not: lookup the closest value in the index and see if it begins with the value. The operation of looking up the closest value in an index would be much more efficient.

> I don't understand how it is possible in the slow case  Rows Removed by Filter: 792025 (returns 0 row) and in the second case Rows Removed by Filter: 15 (returns 1 row).
Pavel, I think it is because the scan found a suitable row at the beginning of the scan and stopped the scan. If you look at that plan you will see that it uses a seq scan. It was fast by accident. :-)

The plan of that single-row version was changed to a normal index scan, after I added the collation "C" index:

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
select exists ( 
	select id from media.oo_file of2  where relpath like 'this does not exist%'
);
QUERY PLAN                                                                                                                           |
-------------------------------------------------------------------------------------------------------------------------------------+
Result  (cost=0.63..0.64 rows=1 width=1) (actual time=0.022..0.023 rows=1 loops=1)                                                   |
  Output: $0                                                                                                                         |
  Buffers: shared hit=4                                                                                                              |
  InitPlan 1 (returns $0)                                                                                                            |
    ->  Index Only Scan using test on media.oo_file of2  (cost=0.55..8.57 rows=108 width=0) (actual time=0.018..0.018 rows=0 loops=1)|
          Index Cond: ((of2.relpath >= 'this does not exist'::text) AND (of2.relpath < 'this does not exisu'::text))                 |
          Filter: (of2.relpath ~~ 'this does not exist%'::text)                                                                      |
          Heap Fetches: 0                                                                                                            |
          Buffers: shared hit=4                                                                                                      |
Planning Time: 0.530 ms                                                                                                              |
Execution Time: 0.055 ms                                                                                                             |

I would expect for the same originally slow query with the has_file column, but it does not happen. :-(