Re: Unexpectedly high disk space usage RESOLVED (Manual reindex/vacuum)

Lists <lists@xxxxxxxxxxxxxxxxxx> · Fri, 09 Nov 2012 16:28:50 -0800

As I've spent a considerable amount of time trying to sort this out, I'm 
posting it for the benefit other users.

I've experienced persistent, ongoing issues with autovacuum in a mixed 
read/write environment with midrange hardware (16 core Xeon, 128 GB RAM, 
200 GB SATA3 6 Gb SSDs for disk I/O on RHEL6) using the defaults in the 
9.1 RPMs provided by Postgres. (yum.postgresql.org) The cause of this is 
not yet determined. It may be related to the any or all of the 
combination of:

    A) extensive use of temp tables;
    B) extensive use of multiple dblink() calls in a single query;
    C) use of transactions, especially prepared transactions and 
multiple savepoints;
    D) concurrent use of pg_dump;
    E) use of numerous databases on a single server, average about 50;

To offset this, we turned off autovacuum, and used an old script to 
vacuum the tables in the middle of the night when nobody was looking. 
Unfortunately, the vacuum script only vacuumed the "userland" tables and 
tremendous amounts of disk space were being wasted, particularly in the 
pg_attribute tables.

However, use of any of the statements "vacuum analyze", "vacuum full 
analyze", "vacuum full verbose analyze" without mentioning specific 
tables did not resolve the extra disk space used issue, disk usage still 
remained at least 5x the expected amount in all cases. (in one case, use 
of all of these open-ended vacuum queries did almost nothing)

Nor did running any variation of "vacuum analyze $table" in a loop thru 
all tables (including the pg_* tables) completely resolve the issue, 
either.

In order to completely clean things up, we ended up writing a script do 
the following:

1) Determine the databases using excessive disk space, in descending 
order of use with this query:

SELECT
d.datname as Name,
d.datistemplate::int AS datistemplate,
pg_catalog.pg_get_userbyid(d.datdba) as Owner,
CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
    THEN pg_catalog.pg_database_size(d.datname)
    ELSE -1
    END as Size
FROM pg_catalog.pg_database d
    order by
        CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
        THEN pg_catalog.pg_database_size(d.datname)
        ELSE NULL
        END desc nulls first;

2) For each database from #1, get a list of tables to be cleaned up with 
this query:

SELECT
    nspname || '.' || relname AS "table",
    pg_total_relation_size(C.oid) AS "size"
FROM
    pg_class C
LEFT JOIN pg_namespace N ON
    (N.oid = C.relnamespace) WHERE (true or (nspname NOT IN 
('pg_catalog', 'information_schema')))
    AND C.relkind = 'r'
    AND nspname !~ '^pg_toast'
ORDER BY
    pg_total_relation_size(C.oid) DESC;

3) For each of the tables from #2, run the commands
REINDEX TABLE $table;
VACUUM FULL ANALYZE $table;

The end result is a squeaky-clean database server with expected disk usage.

NOTES:

1) The above queries are derived from queries found to determine how 
much disk space was used, even though the additional information 
provided isn't actually used by the script.

2) It was sheer chance that I discovered the need to reindex prior to 
vacuum in order to get the disk space back.

3) I'd like to get autovacuum to work. I've read suggestions to tweak 
cost_delay and/or cost_limit. I haven't yet determined if the problem is 
I/O based or lock/deadlock based. I'm guessing the problem is the 
latter, though it's hard to tell because queries stack up quickly and 
load average is sky high when autovacuum fails for us.

4) The aforementioned process is S-L-O-W. Think at least hours and 
probably days depending on your databases, your server(s), and the load.

5) I don't yet know if the "full" option for the vacuum is necessary to 
free up all space. I will experiment with this and post results if useful.

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general