On Tue, Dec 5, 2017 at 5:43 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote: > On Mon, Dec 4, 2017 at 5:52 PM, John R Pierce <pierce@xxxxxxxxxxxx> wrote: >> On 12/4/2017 2:21 PM, chris kim wrote: >>> How would I investigate if my database is nearing a transaction wrap >>> around. >> >> it would be screaming bloody murder in the log, for one. >> > > Unfortunately, that comes far too late to repair the problem without a > substantial service interruption, on very high transaction throughput > installations. > > Also, people usually consult the logs to figure out what the problem is, > once they become aware that one exists. That is also too late. The problem is that our logic (1) focuses on when we should *start* freezing, not by when we'd like to be finished, and (2) is defined in such a way that many tables are likely to reach the trigger point at the same time. Even if your system can handle the load, you might not like the disruption to regular vacuuming and analyze work. An ideal system would estimate how long it's going to take and how long we've got (current tx consumption rate, xids remaining) before autovacuum_freeze_max_age is reached and then spread the work out so that we get it done just in time with minimal impact. Getting reliable estimates to control that seems hard though. Perhaps we could add a much simpler first defence that tries to prevent autovacuum_freeze_max_age (and its multixact cousin) from being reached like this: consider launching at most one wraparound vacuum for any relation that is *half way* to autovacuum_freeze_max_age. That gives the system a chance to handle each partition of a monster partitioned table calmly in series even if they have the same age dating back to schema creation/data load time. Perhaps it could consider adding more vacuum backends as you approach autovacuum_freeze_max_age, or something. Hopefully you'd never actually reach it. Of course you can do what I just said with a cron job, and there may be better heuristics than that, but it'd be nice to find *some* way to make freeze max age more gradual by default on large databases, until such time as we can kill it with 64 bit xids or other major efforts. My understanding is that even with the new freeze map, most big systems will still pay the full price for the first wraparound vacuum freeze, so I still expect to encounter 20TB production databases in the wild that have gone into a wraparound frenzy confounding their owners. -- Thomas Munro http://www.enterprisedb.com