Re: Queries for unused/useless indexes

Melvin Davidson <melvin6925@xxxxxxxxx> · Fri, 22 May 2015 22:07:43 -0400

200 is a completely arbitrary value. At the time, I wanted to find indexes that were sufficiently less used than most others in a highly queried system. To find indexes that were never used, just change the value to 0.

On Fri, May 22, 2015 at 9:12 PM, Venkata Balaji N <nag1010@xxxxxxxxx> wrote:
On Fri, May 22, 2015 at 11:41 PM, Melvin Davidson <melvin6925@xxxxxxxxx> wrote:

Over the years I've wrote many scripts and queries to track the database status. Recently I've had to convince a client who thought it was a good idea to create indexes for every column on every table that it is really a bad idea. To do so, I wrote useless_indexes2.sql, which shows every index that has never been scanned. They still didn't believe me, so I wrote wasted_index_space.sql. That shows how much space is wasted by all the unused indexes. 

I'd like to share those queries with the community, as I know there must be others out there with the same problem.

I had a similar problem a few times in the past with some of our clients. I completely agree that it is not at all a good idea and we are simply inviting an extra over-head from maintenance and performance perspective.

Indexing every column of the table does not make sense as it is almost impossible that every column of the table can have rows with high cardinality. That's not typical RDBMS design.

Generally, most of them believe that, if an Index is unused, though its not beneficial at-least its not a harm. That is not correct.

- Depending on the data-types and cardinality of the columns, Indexes can occupy a lot of space and remain unused. This invites maintenance over-head 
  (ex: backups and vacuum operations)
- The biggest problem is, if the table is a heavy-write table, even though Indexes are not picked during SELECT, they cannot escape 
  WRITES (INSERTS/UPDATES). This is purely an extra and unnecessary I/O.

/* useless_indexes.sql */
SELECT 
       idstat.schemaname AS schema,
       idstat.relname AS table_name,
       indexrelname AS index_name,
       idstat.idx_scan AS times_used,
       pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' || quote_ident(idstat.relname))) AS table_size, 
       pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' || quote_ident(indexrelname))) AS index_size,
       n_tup_upd + n_tup_ins + n_tup_del as num_writes,
       indexdef AS definition
FROM pg_stat_user_indexes AS idstat 
JOIN pg_indexes ON indexrelname = indexname
JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
WHERE idstat.idx_scan < 200
AND indexdef !~* 'unique'
ORDER BY idstat.schemaname, 
         idstat.relname, 
         indexrelname;

Not sure why do you have "<200" 

Regards,
Venkata Balaji N

Fujitsu Australia

-- 
Melvin Davidson
I reserve the right to fantasize.  Whether or not you 
 wish to share my fantasy is entirely up to you.