On Mon, 2024-02-19 at 23:08 +0000, Tim Palmer wrote: > I can identify indexes that are used directly during query execution using > pg_stat_user_indexes. However, it is possible that the optimizer might be > using the statistics on the indexed expressions to guide query planning. > If that is the case then I might remove an index which appears to be unused, > only to then find that this results in the optimizer choosing poorer plans > as it no longer has the additional statistics on the indexed expressions. > > How can I identify indexes which are used by the optimizer when planning? I don't think you can. > If there is nothing available inside the database, I was thinking I would > have to parse all of the queries executed on the database, extract the > expressions, work out which columns of which tables are referenced by > those expressions, and then check whether those expressions are indexed. Well, in an ideal world, you should know which indexes you created for which query. But then indexes might be used in ways you didn't intend. > Or (since that seems like a lot of work), I could remove all indexes which > are reported as unused by pg_stat_user_indexes, but for each and every one > of them create extended statistics on the relevant expression(s) using > CREATE STATISTICS. That might result in a lot of unnecessary expression > statistics - are there any significant costs associated with that? The > docs say "providing benefits similar to an expression index without the > overhead of index maintenance" [1], and don't mention any downside. The small downside is that ANALYZE will have more work, the big downside is that query planning on tables with advanced statistics takes slightly longer. That may or may not be a problem for you, but if you could pay the price of the extra indexes, you will certainly be able to pay the price of advanced statistics. You could also consider the option to drop indexes and see if somebody complains, or watch out for changes in pg_stat_statements. Yours, Laurenz Albe