On 2015-06-03 15:01:46 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > That's not necessarily the case though, given how the code currently
> > works. In a bunch of places the SLRUs are accessed *before* having
> > been made consistent by WAL replay. Especially if several
> > checkpoints/vacuums happened during the base backup, the assumed
> > state (i.e. the mxacts the checkpoints refer to) of the data
> > directory soon after the initial start and the state of
> > pg_multixact/ won't necessarily match at all.
>
> Well, the log extract is quite simple; it says that as soon as the
> standby starts replaying WAL, it fetches exactly one WAL segment,
> which contains exactly one online checkpoint record; this record
> contains oldestMulti=4624559, which belongs in offsets file 0046. But
> the basebackup only contains files starting from 004B onwards. The
> base backup takes a long time, so my guess of what happened is that
> the backup label points to the start of the WAL stream (obviously),
> then the data files are copied to the standby; during this long
> process, a multixact truncation happens. So by the time the base
> backup reaches the pg_multixact directory, the 0046 file has been
> removed.

Yes. That's precisely what I was concerned about above and elsewhere in
the thread. It's simply not ok to access an SLRU while not yet
consistent...

> One idea I had was: what if the oldestMulti pointed to another multi
> earlier in the same 0046 file, so that it is read as zeroes (and the
> file is created), and then a subsequent multixact truncate tries to
> read a later page in the file? In SlruPhysicalReadPage() this would
> give a chance for open() to not fail, and then read() could fail as
> above. However, we would have an earlier LOG message about "reading
> as zeroes".
>
> Really, the whole question of how this code gets past the open()
> failure in SlruPhysicalReadPage baffles me. I don't see any possible
> way for the file to be created ...
Wouldn't a previous WAL record zeroing another part of that segment
explain this? A zero-sized segment would pretty much lead to this error,
right? Or were you able to check how things look after the failure?

> 2015-05-27 16:15:17 UTC [4782]: [3-1] user=,db= LOG: entering standby mode
> 2015-05-27 16:15:18 UTC [4782]: [4-1] user=,db= LOG: restored log file "00000001000073DD000000AD" from archive
> 2015-05-27 16:15:18 UTC [4782]: [5-1] user=,db= FATAL: could not access status of transaction 4624559
> 2015-05-27 16:15:18 UTC [4782]: [6-1] user=,db= DETAIL: Could not read from file "pg_multixact/offsets/0046" at offset 147456: Success.
> 2015-05-27 16:15:18 UTC [4778]: [4-1] user=,db= LOG: startup process (PID 4782) exited with exit code 1
> 2015-05-27 16:15:18 UTC [4778]: [5-1] user=,db= LOG: aborting startup due to startup process failure

From this it's not entirely clear where the error was triggered.

Greetings,

Andres Freund