On Tue, Aug 20, 2019 at 2:44 AM Mariel Cherkassky <mariel.cherkassky@xxxxxxxxx> wrote:
Hey, I have 2 db nodes (9.6) configured with streaming replication (+repmgr). Suddenly yesterday my secondary stopped syncing and I saw the following error in the log: invalid record length at X/YYYYY: wanted 24, got
Did it really just end the message with "got"?
My next idea is to use pg_resetxlog to start the secondary successfully and then use pg_rewind to sync it with the master again. The master is working perfectly and there aren't any issues on it.
Since you don't know what went wrong, I don't think I'd rely on pg_rewind to fix it. Also, while I haven't used pg_rewind, I think it requires the destination to be shut down while it runs. So pg_resetxlog would not be needed, and would likely even be harmful.
Right now, I'm not interested in taking a basebackup and creating the secondary from scratch..
Why not? Too much disk activity? Too much network traffic? If the latter, you could do a low-level backup, using rsync in checksum mode as the file transfer method.
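
To make the rsync suggestion concrete, here is a rough sketch of the command sequence I have in mind, written as a small Python helper that only builds and prints the commands and runs nothing. The host name, the data directory paths, the "postgres" user and the backup label are all made up for illustration. It assumes the secondary is shut down first, that it is run on the secondary, and that the 9.6 exclusive backup functions (pg_start_backup / pg_stop_backup) are used on the master. Treat it as a starting point, not a tested procedure.

```python
import shlex


def low_level_backup_commands(primary_host, primary_pgdata, standby_pgdata,
                              label="resync"):
    """Build (but do not run) the commands for an rsync-based resync.

    All arguments are hypothetical placeholders supplied by the caller.
    """
    # Put the master into backup mode; 'true' requests an immediate checkpoint.
    start = ["psql", "-h", primary_host, "-U", "postgres", "-c",
             "SELECT pg_start_backup('%s', true)" % label]

    # --checksum compares file contents rather than size/mtime, so only files
    # that really differ cross the network (at the cost of reading every file
    # on both sides). Excluded paths are neither copied nor deleted.
    # Assumption: stale WAL left in the secondary's pg_xlog is handled
    # separately before the secondary is started again.
    rsync = ["rsync", "-a", "--checksum", "--delete",
             "--exclude=postmaster.pid",
             "--exclude=recovery.conf",
             "--exclude=pg_xlog/*",
             "%s:%s/" % (primary_host, primary_pgdata),
             "%s/" % standby_pgdata]

    # Take the master out of backup mode once the copy has finished.
    stop = ["psql", "-h", primary_host, "-U", "postgres", "-c",
            "SELECT pg_stop_backup()"]

    return [start, rsync, stop]


def format_commands(commands):
    """Render each command as a shell-quoted line for review."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


if __name__ == "__main__":
    plan = low_level_backup_commands("primary.example",
                                     "/var/lib/pgsql/9.6/data",
                                     "/var/lib/pgsql/9.6/data")
    for line in format_commands(plan):
        print(line)
```

The trade-off is that checksum mode swaps network traffic for disk reads on both nodes, so it only helps if the network really is the bottleneck.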
Cheers,
Jeff