On Thu, May 28, 2015 at 8:01 AM, Robert Haas <robertmhaas@xxxxxxxxx> wrote:
> On Wed, May 27, 2015 at 6:21 PM, Alvaro Herrera
> <alvherre@xxxxxxxxxxxxxxx> wrote:
>> Steve Kehlet wrote:
>>> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
>>> just dropped new binaries in place) but it wouldn't start up. I found this
>>> in the logs:
>>>
>>> waiting for server to start....2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>>> database system was shut down at 2015-05-27 13:12:55 PDT
>>> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL: the database system is
>>> starting up
>>> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL: could not access status of
>>> transaction 1
>>
>> I am debugging today a problem currently that looks very similar to
>> this. AFAICT the problem is that WAL replay of an online checkpoint in
>> which multixact files are removed fails because replay tries to read a
>> file that has already been removed.
>
> Wait a minute, wait a minute. There's a serious problem with this
> theory, at least in Steve's scenario. This message:
>
> 2015-05-27 13:13:00 PDT [27341]: [1-1] LOG: database system was shut
> down at 2015-05-27
>
> That message implies a *clean shutdown*. If he had performed an
> immediate shutdown or just pulled the plug, it would have said
> "database system was interrupted" or some such.
>
> There may be bugs in redo, also, but they don't explain what happened to Steve.
>
> Steve, is there any chance we can get your pg_controldata output and a
> list of all the files in pg_clog?

Err, make that pg_multixact/members, which I assume is at issue here.
You didn't show us the DETAIL line from this message, which would
presumably clarify:

FATAL: could not access status of transaction 1

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
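
For anyone gathering the two items requested above (pg_controldata output and a listing of pg_multixact/members), here is a minimal Python sketch. It is not part of the original message. It assumes pg_controldata is on the PATH, that the data directory is passed as the only argument, and that segment files have hexadecimal names; list_segments and collect are hypothetical helper names chosen for illustration.

```python
import os
import re
import subprocess
import sys

# Assumption: SLRU segment files are named with uppercase hex digits (e.g. 0000, 00FF).
HEX_NAME = re.compile(r"[0-9A-F]+")


def list_segments(names):
    """Return the hex-named entries of `names`, sorted by numeric value."""
    segments = [n for n in names if HEX_NAME.fullmatch(n)]
    return sorted(segments, key=lambda n: int(n, 16))


def collect(pgdata):
    """Return (pg_controldata output, sorted members segment names) for `pgdata`."""
    control = subprocess.run(
        ["pg_controldata", pgdata],
        capture_output=True,
        text=True,
        check=True,  # raise if pg_controldata exits with a non-zero status
    ).stdout
    members_dir = os.path.join(pgdata, "pg_multixact", "members")
    return control, list_segments(os.listdir(members_dir))


# Only runs when invoked as a script with exactly one argument (the data directory).
if __name__ == "__main__" and len(sys.argv) == 2:
    control_output, member_files = collect(sys.argv[1])
    print(control_output)
    print("pg_multixact/members:", " ".join(member_files) or "(empty)")
```

Run it as the user that owns the data directory, passing the data directory path as the argument. Sorting by numeric value rather than by string keeps the lowest and highest segment easy to spot if the names ever differ in width.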