On Sun, Feb 14, 2021 at 11:43 PM Victor Sudakov <vas@xxxxxxxxxx> wrote:
Keith Fiske wrote:
> >
> > What queries do you use to estimate index and table bloat?
[dd]
> Why estimate when you can get the exact amount? At least for b-tree indexes
> anyway.
>
> https://github.com/keithf4/pg_bloat_check
Hello Keith,
Thanks for this script and for reminding about pgstattuple.
The script has the drawback of requiring r/w access to the database, so
it cannot be run on a replica. But the idea is excellent. At least it
can show the top N bloated relations, and then I can further explore with
pgstattuple(), pgstatindex() etc.
--
Victor Sudakov, VAS4-RIPE, VAS47-RIPN
2:5005/49@fidonet http://vas.tomsk.ru/
I could possibly see about letting this actually run against the replica, however this can be a rather long running transaction depending on the size of the tables involved. You can set the "--commit_rate" to avoid some of that, but if you have really large tables, it can still run quite long. So this sort of check is really best run against the primary to avoid issues around having to allow long running queries on the replica (delayed replication or even worse bloat buildup).
However, once you get bloat under control, you likely shouldn't need to be running this often, especially against the entire database. If a few tables turn out to be problematic, you can make schedules just for them. And otherwise run an entire database scan at most maybe once a month during off-peak hours.