On Tue, May 30, 2017 at 9:32 PM, Magnus Hagander <magnus@xxxxxxxxxxxx> wrote:
> On Tue, May 30, 2017 at 9:14 PM, Ludovic Vaugeois-Pepin
> <ludovicvp@xxxxxxxxx> wrote:
>>
>> I ran into the issue described below with 10.0 beta. The error I got is:
>>
>> pg_basebackup: could not create temporary replication slot
>> "pg_basebackup_2194": ERROR: replication slot "pg_basebackup_2194"
>> already exists
>>
>> A race condition? Or maybe I am doing something wrong.
>>
>>
>> Release:
>>     Name    : postgresql10-server
>>     Version : 10.0
>>     Release : beta1PGDG.rhel7
>>
>> Test Type:
>>     Functional testing of a pacemaker resource agent
>>     (https://github.com/ulodciv/pgha)
>>
>> Test Detail:
>>     During context/environment setup, pg_basebackup is invoked (in
>>     parallel) from multiple virtual machines. The backups are then
>>     started as asynchronously replicated hot standbys.
>>
>> Platform:
>>     CentOS 7.3
>>
>> Installation Method:
>>     yum -y install https://download.postgresql.org/pub/repos/yum/testing/10/redhat/rhel-7-x86_64/pgdg-redhat10-10-1.noarch.rpm
>>     yum -y install postgresql10-server postgresql10-contrib
>>
>> Platform Detail:
>>
>> Test Procedure:
>>     Have pg_basebackup run simultaneously on multiple hosts against
>>     the same instance, e.g.:
>>
>>     pg_basebackup -h test4 -p 5432 -D /var/lib/pgsql/10/data -U repl1 -Xs
>>
>> Failure?
>>     E   deploylib.deployer_error.DeployerError:
>>         postgres@test5: got exit status 1 for:
>>     E   pg_basebackup -h test4 -p 5432 -D /var/lib/pgsql/10/data -U repl1 -Xs
>>     E   stderr: pg_basebackup: could not create temporary replication slot "pg_basebackup_2194": ERROR: replication slot "pg_basebackup_2194" already exists
>>     E   pg_basebackup: child process exited with error 1
>>     E   pg_basebackup: removing data directory "/var/lib/pgsql/10/data"
>>
>> Test Results:
>>
>> Comments:
>>     This seems to be new with 10. I recently began testing the
>>     pacemaker resource agent against PG 10. I never had (or noticed)
>>     this failure with 9.6.1 and 9.6.2.
>
>
> Hah, that's an interesting failure. In the name of the slot, the 2194 comes
> from the pid -- but it's the pid of pg_basebackup.
>
> I assume you're not running the two pg_basebackup processes on the same
> machine? Is it predictable when this happens (meaning that the pid value is
> actually predictable), or do you have to run it a large number of times
> before it happens?

Indeed, I run it from two VMs that were created from the same .ova
(packaged VM). I have run into this only once, although I have been running
tests on 10.0 for a couple of days or so. My guess is that the two hosts
ended up using the same pid when running the backup.
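
To illustrate that guess, here is a minimal sketch of how the collision could arise. It assumes the temporary slot name is simply "pg_basebackup_" followed by the client's pid, which is what the error message and Magnus's remark suggest; I have not checked the actual source. The hosts test6 and test7 and the pid 2301 are made up for the example.

```python
# Sketch only: models the naming scheme inferred from the error message,
# not the actual pg_basebackup implementation.

def temp_slot_name(client_pid: int) -> str:
    """Slot name a client would request, assuming 'pg_basebackup_<pid>'."""
    return f"pg_basebackup_{client_pid}"


def find_collisions(clients: dict[str, int]) -> dict[str, list[str]]:
    """Group hosts by requested slot name; keep names wanted by more than one host."""
    by_slot: dict[str, list[str]] = {}
    for host, pid in clients.items():
        by_slot.setdefault(temp_slot_name(pid), []).append(host)
    return {slot: hosts for slot, hosts in by_slot.items() if len(hosts) > 1}


# test5 and pid 2194 come from the failure above; test6, test7 and 2301 are
# hypothetical. Cloned VMs booting identically can plausibly reach the same pid.
clients = {"test5": 2194, "test6": 2194, "test7": 2301}
collisions = find_collisions(clients)
print(collisions)
```

The pid is only unique per host, while slot names must be unique on the server being backed up, so two clients with the same pid would ask for the same slot.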