Robert Haas wrote:
> On Fri, Jun 5, 2015 at 2:20 AM, Noah Misch <noah@xxxxxxxxxxxx> wrote:
> > On Thu, Jun 04, 2015 at 05:29:51PM -0400, Robert Haas wrote:
> >> Here's a new version with some more fixes and improvements:
> >
> > I read through this version and found nothing to change. I encourage other
> > hackers to study the patch, though. The surrounding code is challenging.
>
> Andres tested this and discovered that my changes to
> find_multixact_start() were far more creative than intended.
> Committed and back-patched with a trivial fix for that stupidity and a
> novel-length explanation of the changes.

I think novel-length is fine. The bug itself is pretty complicated, and
so is the solution. Many thanks for working through this.

FWIW I tested with the (attached) reproducer script(*) for my customer's
problem, and it works fine now where it failed before. One thing which
surprised me a bit, but in hindsight should have been pretty obvious, is
that the "multixact member protections are fully armed" message is only
printed once the standby gets out of recovery, instead of when it
reaches consistent state or some such earlier point.

(*) Actually the script cheats to get past an issue, which I couldn't
actually figure out, that a file can't be "seeked"; I just do a "touch"
to create an empty file there, which causes the same error situation as
on my customer's log.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment:
repro-chkpt-replay-failure.sh
Description: Bourne shell script
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general