On 02/16/2017 04:39 PM, Richard Brosnahan wrote:
Hi all,

Way back in December I posted a question about mirroring from an RPM-installed PostgreSQL (binary) to a source-built PostgreSQL, with the same version (9.4.1 --> 9.4.1). Both servers are running OEL6.
I went back through the previous threads and could not find whether you ever said if the two systems use the same hardware architecture. Vincent Veyron asked, but I can't find a response.
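For reference, a minimal sketch of one way to answer that question, to be run on both the master and the slave. The function name describe_platform is made up for illustration. It assumes a Linux system where uname and getconf are available, which should be the case on OEL6.

```shell
# Hypothetical helper (name invented for this example): print the machine
# architecture and the userland word size. Run it on both servers and
# compare the two lines of output.
describe_platform() {
    printf 'machine: %s\n' "$(uname -m)"
    printf 'word size: %s\n' "$(getconf LONG_BIT)"
}

describe_platform
```

If the two servers report different values, a control file copied from one may not be readable by binaries on the other, since the layout of pg_control depends on the platform the binaries were built for.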
I won't copy the entire thread from before, as the situation has changed a bit. The biggest changes are that I have root on the slave, temporarily, and I've installed PostgreSQL on the slave using yum (also binary).

I've followed all the instructions found here:

https://www.postgresql.org/docs/9.4/static/warm-standby.html#STREAMING-REPLICATION

The slave is running PostgreSQL 9.4.11 and was installed using yum. It runs fine after I've run initdb and set things up. The master was also installed from rpm binaries, but the installers used Puppet. That version is 9.4.1. Yes, I know I should be using the exact same version, but I couldn't find 9.4.1 in the PostgreSQL yum repo.

When I replace the slave's data directory as part of the mirroring instructions, using pg_basebackup, PostgreSQL won't start. I get a checksum error from pg_ctl:

2016-12-15 08:27:14.520 PST >FATAL: incorrect checksum in control file

Previously, Tom Lane suggested I try this:

You could try using pg_controldata to compare the pg_control contents; it should be willing to print field values even if it thinks the checksum is bad. It would be interesting to see (a) what the master's pg_controldata prints about its pg_control, (b) what the slave's pg_controldata prints about pg_control from a fresh initdb there, and (c) what the slave's pg_controldata prints about the copied pg_control.

For Tom's requests (a) and (b), I can provide good output from pg_controldata from the master with production data, and from the slave right after initdb. I'll provide that on request.

For Tom's request (c), I get this from the slave, after data is copied:

$ pg_controldata
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

Segmentation fault (core dumped)

With this new installation on the slave, same result.
core dump

Tom Lane then suggested:

$ gdb path/to/pg_controldata
gdb> run /apps/database/postgresql-data
(wait for it to report segfault)
gdb> bt

Since I now have gdb, I can do that:

$ gdb /usr/pgsql-9.4/bin/pg_controldata
-bash: gdb: command not found
-bash-4.1$ gdb /usr/pgsql-9.4/bin/pg_controldata
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-90.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/pgsql-9.4/bin/pg_controldata...(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install postgresql94-server-9.4.11-1PGDG.rhel6.x86_64
(gdb) run /var/lib/pgsql/9.4/data
Starting program: /usr/pgsql-9.4/bin/pg_controldata /var/lib/pgsql/9.4/data
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.

Program received signal SIGSEGV, Segmentation fault.
0x00000033d20a3a15 in __strftime_internal () from /lib64/libc.so.6
(gdb) bt
#0 0x00000033d20a3a15 in __strftime_internal () from /lib64/libc.so.6
#1 0x00000033d20a5a36 in strftime_l () from /lib64/libc.so.6
#2 0x00000000004015c7 in ?? ()
#3 0x00000033d201ed1d in __libc_start_main () from /lib64/libc.so.6
#4 0x0000000000401349 in ?? ()
#5 0x00007fffffffe518 in ?? ()
#6 0x000000000000001c in ?? ()
#7 0x0000000000000002 in ?? ()
#8 0x00007fffffffe751 in ?? ()
#9 0x00007fffffffe773 in ?? ()
#10 0x0000000000000000 in ?? ()
(gdb)

pg_controldata shouldn't be core dumping.

Should I give up trying to use 9.4.1 and 9.4.11 as master/slave?
My options appear to be:

1. Upgrade the master to 9.4.11, which will be VERY DIFFICULT given its Puppet install, and the difficulty I have getting root access to our servers.
2. Downgrade the slave. This is easier than option 1, but I would need to find a yum repo that has that version.
3. Make what I have work, somehow.

Any assistance would be greatly appreciated!

--
Richard Brosnahan
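On the pg_controldata comparison Tom asked for: outputs (a) and (b) are said to be available, so a rough sketch of how they could be compared field by field may be useful. The function name compare_controldata and the file names in the usage note are hypothetical. It assumes each file holds the saved output of pg_controldata run against the relevant data directory, and it only uses diff and grep.

```shell
# Hypothetical helper (name invented for this example): print only the
# pg_controldata lines that differ between two saved outputs.
# Assumes each file was produced with something like:
#   pg_controldata /path/to/data > file.txt
compare_controldata() {
    # diff exits non-zero when the files differ, and grep exits non-zero
    # when nothing matches; neither is an error here, hence "|| true".
    diff "$1" "$2" | grep '^[<>]' || true
}
```

Usage would be along the lines of: compare_controldata master.txt slave.txt. Lines starting with "<" come from the first file and lines starting with ">" from the second. Values such as the LSNs and timestamps will differ between any two clusters; the layout-related fields (for example "Maximum data alignment" or "Date/time type storage") are the ones that would suggest the two builds are not binary compatible. This does not help with case (c), since pg_controldata crashes on the copied file.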
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx