On Mon, May 23, 2016 at 6:46 PM, Alvaro Herrera <alvherre@xxxxxxxxxxxxxxx> wrote:
Natalie Wenz wrote:
> Question:
> When you cancel a vacuum freeze (or auto vacuum that is running because the table's max xid exceeded autovacuum_max_freeze_age) that's running on a large table, how much work is potentially lost?
>
> I ask because I have a vacuum freeze running in single-user mode on a database that is 46 TB, 35 TB of which is one table. I recently migrated it to 9.5.2 (on a new host), but because so much of the data is in one ridiculous table, I dumped the old database, excluding that big table, with pg_dump, restored everything else using pg_restore, and then used copy statements to dump and then load the data from the big table in chunks, so I could dump and reload that table with some parallelism. I got everything migrated, and started logging live data to it. Everything went well until a few weeks later, when the auto vacuum wasn't able to keep up with our transaction load. The database shut itself down and I got it running in single-user mode and started a vacuum freeze. After a while we realized (you probably know where this is going) that the vacuum has to freeze allllllll of the rows in the 35 TB table. I'm wondering if it would be worth it to shut it down again and retune for a more aggressive vacuum. I'm under the impression that some work would probably be lost, but is there a limit to how much?
The problem with freeze vacuums is that they need to scan the whole
table to completion. If you cancel one, the only thing you really lose
is the portion of the scan that has executed thus far. If you start a
second one, it will begin from scratch; depending on how slow the
original one was, it might catch up pretty quickly.
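To see how urgent the freeze is (and which relations are responsible)
before deciding whether cancelling and restarting is worth it, the
usual catalog queries look roughly like this (a sketch only; nothing
here is specific to your setup):

    -- how close each database is to the ~2 billion xid wraparound limit
    SELECT datname, age(datfrozenxid) AS xid_age
      FROM pg_database
     ORDER BY age(datfrozenxid) DESC;

    -- the tables driving that age in the current database
    SELECT relname, age(relfrozenxid) AS xid_age,
           pg_size_pretty(pg_total_relation_size(oid)) AS total_size
      FROM pg_class
     WHERE relkind = 'r'
     ORDER BY age(relfrozenxid) DESC
     LIMIT 20;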
The second point to keep in mind is that if there are lots of dead
tuples that need to be cleaned up, scanning the indexes might take
considerable additional time. You want to avoid this as much as
possible by having the largest maintenance_work_mem you can afford
(keeping in mind that, for autovacuum workers, it is overridden by
autovacuum_work_mem); but as I recall the memory vacuum will actually
use for its dead-tuple list is hardcoded at 1GB, which is rather
unhelpful for very large tables.
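For a manual freeze run in a dedicated session or in single-user mode,
the aggressive variant is roughly the following sketch; the table name
is a placeholder, and the 1GB value is only there to make the cap
explicit:

    -- session-level settings for an aggressive manual freeze
    SET vacuum_cost_delay = 0;           -- cost-based throttling off (already the default for manual VACUUM)
    SET maintenance_work_mem = '1GB';    -- vacuum's dead-tuple list is capped at 1GB regardless
    VACUUM (FREEZE, VERBOSE) big_table;  -- VERBOSE reports each heap/index pass as it finishes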
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
If you haven't considered it yet, you may want to look into partitioning that table if at all possible. While Postgres can handle tables that large for the most part, you're now seeing the kind of trouble that ongoing maintenance on them can cause. If there is a timestamp or id number the data can be partitioned by, I've written a tool that can make this a lot easier to manage.
https://github.com/keithf4/pg_partman
For the most part, I use the method laid out in the documentation, with some added features:
http://www.postgresql.org/docs/9.5/static/ddl-partitioning.html
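In rough outline, the documented inheritance-based approach looks like the sketch below (all table, column, and trigger names are made up for illustration):

    -- parent table stays empty; the children hold the data
    CREATE TABLE measurements (
        id         bigserial,
        logged_at  timestamptz NOT NULL,
        payload    text
    );

    -- one child per month, with a CHECK constraint so the planner can exclude it
    CREATE TABLE measurements_p2016_05 (
        CHECK (logged_at >= '2016-05-01' AND logged_at < '2016-06-01')
    ) INHERITS (measurements);

    CREATE TABLE measurements_p2016_06 (
        CHECK (logged_at >= '2016-06-01' AND logged_at < '2016-07-01')
    ) INHERITS (measurements);

    -- trigger that redirects rows inserted into the parent to the right child
    CREATE OR REPLACE FUNCTION measurements_insert_trigger()
    RETURNS trigger AS $$
    BEGIN
        IF NEW.logged_at >= '2016-05-01' AND NEW.logged_at < '2016-06-01' THEN
            INSERT INTO measurements_p2016_05 VALUES (NEW.*);
        ELSIF NEW.logged_at >= '2016-06-01' AND NEW.logged_at < '2016-07-01' THEN
            INSERT INTO measurements_p2016_06 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'no child table for timestamp %', NEW.logged_at;
        END IF;
        RETURN NULL;  -- keep the row out of the parent
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER insert_measurements
        BEFORE INSERT ON measurements
        FOR EACH ROW EXECUTE PROCEDURE measurements_insert_trigger();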
If a trigger that redirects rows inserted into the parent table out to the child tables would cause too much of a performance degradation, the child table names are predictable, so you can pretty easily script your inserts to go directly into the appropriate child tables.
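For example, a bulk load could target a child directly and skip the trigger entirely (staging_measurements is a made-up staging table, same imaginary names as above):

    -- bypass the trigger by inserting straight into the predictably named child
    INSERT INTO measurements_p2016_05 (logged_at, payload)
    SELECT logged_at, payload
      FROM staging_measurements
     WHERE logged_at >= '2016-05-01' AND logged_at < '2016-06-01';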