On 04/03/2015 08:59 AM, John Scalia wrote:
Hi all,
I have a question regarding vacuum and locking, but since I currently don't have access to my servers (maintenance activities), I can't verify on my own. I know that VACUUM has at
least four variants. OK, actually two with ability to add ANALYZE after the operation. Here we go:
Start here: http://www.postgresql.org/docs/9.4/static/sql-vacuum.html
VACUUM (VACUUM ANALYZE) - shouldn't need to lock the table(?). I'm looking at the Douglas' book "PostgreSQL" on page 804 where it claims the operation does not require exclusive
access.
True, but the increase in I/O *can* impact overall server operation.
Note that the typical default install includes the autovacuum which will
automatically run the vacuum process when necessary. Vacuuming is a
critical part of maintenance for a number of reasons and not running
vacuum is definitely discouraged. Vacuum essentially marks released
space within the files containing the data as reusable. Vacuum does not
free up disk space.
VACUUM FULL (VACUUM FULL ANALYZE) - will need to lock the table, but I'm thinking this may only take a couple of milliseconds as I think this builds a temporary table from the one
being vacuumed and then when complete, drops the old table and renames the temporary. So the lock should only be during this move, or does it lock the table for the entirety?
No, it locks for the duration which can potentially be a very long time
as it is physically reshuffling the data to shrink the on-disk space
usage. This is typically reserved for situations that cause excessive
table bloat such as full-table updates or deletes of large portions of
the table *and* when there is insufficient disk-space to use an
alternative method of compacting.
Look also at CLUSTER
(http://www.postgresql.org/docs/9.4/static/sql-cluster.html). Unlike
VACUUM FULL, CLUSTER *does* write the data to a new disk files but is
far faster than VACUUM and can reorder the table data to match an index
which can improve the performance of certain queries. CLUSTER requires
sufficient available disk-space to write the new copy of the table and
associate indexes which can be anywhere from the size of your original
table/indexes if it is already packed to far less if the table is
heavily bloated.
N.B. The operation of CLUSTER and VACUUM have changed from version to
version so read the docs for your version. For example, CLUSTER used to
require an index and would reorder using that index. This requirement
has been dropped in the latest version.
CLUSTER also requires and exclusive lock so...
Check out the pg_reorg extension. This extension does create a new table
on the fly while tracking updates. It then applies any updates that have
occurred while creating the new copy and finally swaps the new version
in place of the old. It creates I/O but doesn't require the
long-duration locks of the other methods.
Is this correct? And how does it then resolve say another deletion in the table that occurs while the VACUUM is in operation? I hope that is clearly expressed.
Vacuum is just a maintenance operation. Changes to the table will be
dealt with in the next VACUUM.
Note, while you can run VACUUM ANALYZE together, VACUUM and ANALYZE do
different things and can be run independently. While VACUUM deals with
space usage and a few other issues, ANALYZE updates the table statistics
used by the planner.
Cheers,
Steve
--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin