On Tue, Dec 5, 2017 at 5:50 PM, Thomas Munro <thomas.munro@xxxxxxxxxxxxxxxx> wrote:
On Tue, Dec 5, 2017 at 5:43 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
> On Mon, Dec 4, 2017 at 5:52 PM, John R Pierce <pierce@xxxxxxxxxxxx> wrote:
>> On 12/4/2017 2:21 PM, chris kim wrote:
>>> How would I investigate if my database is nearing a transaction wrap
>>> around.
>>
>> it would be screaming bloody murder in the log, for one.
>>
>
> Unfortunately, that comes far too late to repair the problem without a
> substantial service interruption, on very high transaction throughput
> installations.
>
> Also, people usually consult the logs to figure out what the problem is,
> once they become aware that one exists. That is also too late.
The problem is that our logic (1) focuses on when we should *start*
freezing, not by when we'd like to be finished, and (2) is defined in
such a way that many tables are likely to reach the trigger point at
the same time.
Isn't this only the case when you have many insert only-tables? Other tables are going to be vacuumed for wrap around at the first time they are vacuumed (for other reasons) after reaching vacuum_freeze_table_age - vacuum_freeze_min_age. That should be pretty well staggered because they probably have different update and delete rates. But, having those tables locked for an emergency vacuum which is not really an emergency is certainly a pain.
Even if your system can handle the load, you might not
like the disruption to regular vacuuming and analyze work.
An ideal system would estimate how long it's going to take and how
long we've got (current tx consumption rate, xids remaining) before
autovacuum_freeze_max_age is reached and then spread the work out so
that we get it done just in time with minimal impact. Getting
reliable estimates to control that seems hard though.
I think an ideal system wouldn't even have a autovacuum_freeze_max_age setting at all. What good does it do? According to the docs, its function is to limit the size of the clog files, not it to save your database from emergency shutdown.
Other more general problem which I think this brings up is:
A way to separate the log file stream into things that need immediate attention, versus things you want to have available in case you ever go looking for them. You could probably have some process monitor the log file and pull things it thinks need attention and send an email or SMS, but that is an ugly way to do it.
Also, why do we only start issuing warnings 10 million before wrap-around shutdown happens? On a very busy server, this is not enough warning. On a not-very-business server, why would it matter as you will never get anywhere near this point? Seems like we could increase this to 200 million with no down-side. (Maybe the first 190e6 of those would only warn once every 1024 transaction rather every 64).
Perhaps we could add a much simpler first defence that tries to
prevent autovacuum_freeze_max_age (and its multixact cousin) from
being reached like this: consider launching at most one wraparound
vacuum for any relation that is *half way* to
autovacuum_freeze_max_age. That gives the system a chance to handle
each partition of a monster partitioned table calmly in series even if
they have the same age dating back to schema creation/data load time.
Perhaps it could consider adding more vacuum backends as you approach
autovacuum_freeze_max_age, or something. Hopefully you'd never
actually reach it.
So in the simplest embodiment, the autovacuum launcher would keep track of the number of vacuum workers which were launched for "eager wrap around" purposes, and limit it to no more than one of those particular kinds at a time? Would a "real" wrap-around vacuum block a "eager" one?
Would such a worker consider itself to be an emergency and hold the table lock hostage, or would it yield the lock like regular vacuum workers do?
Of course you can do what I just said with a cron job, and there may
be better heuristics than that, but it'd be nice to find *some* way to
make freeze max age more gradual by default on large databases, until
such time as we can kill it with 64 bit xids or other major efforts.
My understanding is that even with the new freeze map, most big
systems will still pay the full price for the first wraparound vacuum
freeze, so I still expect to encounter 20TB production databases in
the wild that have gone into a wraparound frenzy confounding their
owners.
I don't think the problem is so much the frenzy (wrap around vacuums are throttled just as much as regular ones are, which is perhaps a bug not a feature) but rather the uninterruptible nature. Someone tries to add a column. The vacuum will not yield the field but continues to vacuum the table at a very leisurely pace. Now everything grinds to a halt, because an otherwise-momentary access exclusive lock can't be granted, and in turns blocks all access. If that is not bad enough, people can't tolerate (or even understand) this behavior, so they take steps to defeat it and so cause even more problems.
(Which in turn brings up another general issue. When one process wants a lock but is blocked by a weaker lock held by a slow process, it should allow other fast process to jump over it and get a weaker lock compatible with the held one. Identifying which processes are slow and which are likely to be fast is the problem. Some other products allow this indication to be provided by the user, but there may be better ways.)
Cheers,
Jeff