Unable to start up postgres: Could not read from file "pg_clog/00EC"

"Nick Renders" <postgres@xxxxxxxxxx> · Wed, 05 Feb 2020 12:14:21 +0100

Hello,

Yesterday, we experienced some issues with our Postgres installation
(v9.6 running on macOS 10.12).
It seems that the machine was automatically rebooted for an as yet
unknown reason, and afterwards we were unable to start the Postgres
service.

The postgres log shows the following:

2020-02-04 15:20:41 CET LOG:  database system was interrupted; last known up at 2020-02-04 15:18:34 CET
2020-02-04 15:20:43 CET LOG:  database system was not properly shut down; automatic recovery in progress
2020-02-04 15:20:44 CET LOG:  invalid record length at 14A/9E426DF8: wanted 24, got 0
2020-02-04 15:20:44 CET LOG:  redo is not required
2020-02-04 15:20:44 CET FATAL:  could not access status of transaction 247890764
2020-02-04 15:20:44 CET DETAIL:  Could not read from file "pg_clog/00EC" at offset 106496: Undefined error: 0.
2020-02-04 15:20:44 CET LOG:  startup process (PID 403) exited with exit code 1
2020-02-04 15:20:44 CET LOG:  aborting startup due to startup process failure
2020-02-04 15:20:44 CET LOG:  database system is shut down
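
For what it's worth, the file name and offset in the DETAIL line appear to follow directly from the transaction ID in the FATAL line. The sketch below is only an illustration, and it assumes the default Postgres build settings (8 kB pages, 2 status bits per transaction, 32 pages per pg_clog segment file), none of which are confirmed by the log itself. The function name clog_location is made up for this example.

```python
# Assumed PostgreSQL defaults (not stated in the log): 8 kB pages,
# 2 status bits per transaction (4 per byte), 32 pages per segment file.
BLCKSZ = 8192
XACTS_PER_BYTE = 4
PAGES_PER_SEGMENT = 32
XACTS_PER_PAGE = BLCKSZ * XACTS_PER_BYTE
XACTS_PER_SEGMENT = XACTS_PER_PAGE * PAGES_PER_SEGMENT


def clog_location(xid):
    """Return (segment file name, byte offset of the page holding xid)."""
    segment = xid // XACTS_PER_SEGMENT
    page_in_segment = (xid % XACTS_PER_SEGMENT) // XACTS_PER_PAGE
    return "%04X" % segment, page_in_segment * BLCKSZ


# Transaction ID taken from the FATAL line in the log.
name, offset = clog_location(247890764)
print(name, offset)  # prints: 00EC 106496
```

Under those assumptions the result matches the log (file 00EC, offset 106496), which suggests the startup process was simply trying to read the status page for that one transaction.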

After some searching, I found someone who had had a similar issue and 
was able to resolve it by overwriting the file in pg_clog.
So I tried the following command:

	dd if=/dev/zero of=[dbpath]/pg_clog/00EC bs=256k count=1

and now the service is running again.
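
To get a rough idea of what that dd command covers, here is a small sketch. It assumes the default Postgres build settings (8 kB pages, 32 pages per pg_clog segment, 2 status bits per transaction), and the helper segment_xid_range is a made-up name for illustration only.

```python
# Assumed PostgreSQL defaults: 8 kB pages, 32 pages per segment file,
# 2 status bits per transaction (4 transactions per byte).
BLCKSZ = 8192
PAGES_PER_SEGMENT = 32
XACTS_PER_BYTE = 4
SEGMENT_BYTES = BLCKSZ * PAGES_PER_SEGMENT


def segment_xid_range(segment_name):
    """Return (first, last) transaction ID stored in a pg_clog segment."""
    segment = int(segment_name, 16)
    xacts = SEGMENT_BYTES * XACTS_PER_BYTE
    first = segment * xacts
    return first, first + xacts - 1


# dd bs=256k count=1 writes 256 * 1024 zero bytes.
dd_bytes = 256 * 1024 * 1
print(dd_bytes == SEGMENT_BYTES)   # prints: True
print(segment_xid_range("00EC"))   # prints: (247463936, 248512511)
```

If those assumptions hold, the command replaced one full segment, so the status of 1,048,576 transaction IDs (247463936 through 248512511) is now all zero bits. As far as I understand, zero bits mean "in progress" rather than "committed", so any transaction in that range that had actually committed may no longer be recorded as such.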

But I am worried that there might still be some issues that we haven't 
noticed yet. I also have no idea what caused this error in the first 
place. It might have been the reboot, but maybe the reboot was a result 
of a Postgres issue.

Is there anything specific I should check in our Postgres installation /
database to make sure it is running OK now? Is there any way to see what
the consequences were of overwriting that one pg_clog file?

Best regards,

Nick Renders