Re: index bloat estimation

Keith Fiske <keith.fiske@xxxxxxxxxxxxxxx> · Fri, 12 Feb 2021 10:30:21 -0500

On Fri, Feb 12, 2021 at 3:26 AM Victor Sudakov <vas@xxxxxxxxxx> wrote:
Dear Colleagues,

What queries do you use to estimate index and table bloat?

I've researched some on the Net and found multiple scripts mentioned in https://wiki.postgresql.org/wiki/Index_Maintenance#Index_Bloat, also in https://github.com/pgexperts/pgx_scripts etc.

Most of the stuff I've looked at is pretty old, much seems unsupported.

What is the current best practice?

I'd be grateful if you could share your personal favourite ways of estimating bloat.

-- 
Victor Sudakov,  VAS4-RIPE, VAS47-RIPN
2:5005/49@fidonet http://vas.tomsk.ru/

Why estimate when you can get the exact amount? At least for b-tree indexes anyway.
https://github.com/keithf4/pg_bloat_check

This script uses the pgstattuple extension to get both table and b-tree index bloat information. Because it actually scans the table and its indexes, it can take longer than queries that estimate bloat from statistics, but it gives you very accurate information. You can also use pgstattuple directly without this script, but then you have to run it individually on the table and then on each index (pgstattuple() for the table, pgstatindex() for each b-tree index). The script scans the table and all its indexes in one step and gives you a full summary.

https://www.postgresql.org/docs/13/pgstattuple.html
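
For anyone who wants to try pgstattuple directly, here is a rough sketch of the "table first, then each b-tree index" loop described above. It is not taken from pg_bloat_check. The SQL uses the extension's documented pgstattuple() and pgstatindex() functions, but the Python names (table_waste, index_waste, summarize) are made up for illustration, and treating "dead tuples + free space" (tables) and "100 - avg_leaf_density" (indexes) as waste is only one possible definition. It assumes the extension is already installed (CREATE EXTENSION pgstattuple) and that you pass in a DB-API cursor such as one from psycopg2.

```python
# Illustrative sketch only; function and variable names are hypothetical.

# Exact table statistics from a full scan of the relation.
TABLE_SQL = (
    "SELECT table_len, dead_tuple_len, free_space "
    "FROM pgstattuple(%s::regclass)"
)

# pgstatindex() only works on b-tree indexes, so list just those.
LIST_BTREE_SQL = """
SELECT c.oid::regclass::text
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am a ON a.oid = c.relam
WHERE i.indrelid = %s::regclass AND a.amname = 'btree'
"""

INDEX_SQL = (
    "SELECT index_size, avg_leaf_density "
    "FROM pgstatindex(%s::regclass)"
)


def table_waste(table_len, dead_tuple_len, free_space):
    """Return (wasted_bytes, wasted_pct), counting dead tuples plus free space."""
    wasted = dead_tuple_len + free_space
    pct = 100.0 * wasted / table_len if table_len else 0.0
    return wasted, pct


def index_waste(index_size, avg_leaf_density):
    """Return (wasted_bytes, wasted_pct), treating unused leaf space as waste."""
    free_pct = 100.0 - avg_leaf_density
    return index_size * free_pct / 100.0, free_pct


def summarize(cur, table):
    """Scan one table and each of its b-tree indexes; return a dict of results."""
    cur.execute(TABLE_SQL, (table,))
    report = {table: table_waste(*cur.fetchone())}
    cur.execute(LIST_BTREE_SQL, (table,))
    for (index_name,) in cur.fetchall():
        cur.execute(INDEX_SQL, (index_name,))
        report[index_name] = index_waste(*cur.fetchone())
    return report
```

Keep in mind that not all of the reported free space is bloat: b-tree leaf pages default to a fillfactor of 90, so a freshly built index already shows roughly 10% unused leaf space by this measure.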

-- 
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com